IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Atty. Dkt. No.: MOBT:175-2

Prior Application Examiner: 
E. Slobodyansky 

Classification Designation: 
Prior Group Art Unit: 1652 

BOX PATENT APPLICATION 

Assistant Commissioner for Patents 

Washington, D.C. 20231 

REQUEST FOR FILING DIVISIONAL APPLICATION 
UNDER 37 C.F.R. § 1.53(b)

This is a request for filing a divisional application under Rule 53(b) (37 C.F.R. § 1.53(b)) 

of co-pending prior application Serial No. 09/137,440 filed August 20, 1998, entitled 

"GLYPHOSATE-TOLERANT 5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE 

SYNTHASES." 

[X] 1. Enclosed is a copy of the prior application Serial No. 09/137,440 as originally
filed, including specification, claims, drawings, and declaration. The undersigned 
hereby verifies that the attached papers are a true copy of the prior application as 
originally filed and identified above, that no amendments (if any) referred to in 
the declaration filed to complete the prior application introduced new matter 
therein, and further that this statement was made with the knowledge that willful 
false statements and the like so made are punishable by fine or imprisonment or 
NUMBER EL392860338US
DATE OF DEPOSIT December 16, 1999 

I hereby certify that this paper or fee is being deposited with the United 
States Postal Service "EXPRESS MAIL POST OFFICE TO 
ADDRESSEE" service under 37 C.F.R. 1.10 on the date indicated 
above and is addressed to: Assistant Commissioner for Patents, 
Washington, DC 20231. 



Signature 
both, under Section 1001 of Title 18 of the United States Code, and that such
willful false statement may jeopardize the validity of the application or any patent 
issuing thereon. 

(a) The inventorship is the same as prior Application Serial No.
09/137,440. 

2. Enclosed is a check in the amount of $838.00 to cover the filing fee as calculated 

below and the fee for any new claims added in the Preliminary Amendment 

referred to in Clause No. 7 below. 

CLAIMS AS FILED IN THE PRIOR APPLICATION 
LESS CLAIMS CANCELED BELOW 



FOR                           NUMBER FILED   NUMBER EXTRA   RATE        FEE

Basic Fee                                                               $760.00

Total Claims                  8 - 20 =       0              x $18.00 =  $  0.00

Independent Claims            4 - 3 =        1              x $78.00 =  $ 78.00

Multiple Dependent Claim(s)                                             $  0.00

TOTAL FILING FEES:                                                      $838.00
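
The fee arithmetic in the table above can be checked with a short sketch. The function and parameter names are illustrative assumptions; the rates, and the allowances of 20 total claims and 3 independent claims covered by the basic fee, are taken from the table itself.

```python
def filing_fee(total_claims, independent_claims,
               basic_fee=760.00, total_rate=18.00,
               independent_rate=78.00, multiple_dependent_fee=0.00):
    """Return the filing fee using the rates shown in the fee table."""
    # Only claims beyond the allowance covered by the basic fee are charged.
    extra_total = max(total_claims - 20, 0)
    extra_independent = max(independent_claims - 3, 0)
    return (basic_fee
            + extra_total * total_rate
            + extra_independent * independent_rate
            + multiple_dependent_fee)


# 8 total claims and 4 independent claims, as filed in this request.
print(filing_fee(8, 4))
```

With 8 total claims there are no extra total claims, and the single extra independent claim adds $78.00 to the $760.00 basic fee.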



If the check is missing or insufficient, the Assistant Commissioner is hereby 
authorized to charge any fees under 37 C.F.R. §§ 1.16 to 1.21 which may be 
required for any reason relating to this application, or credit any overpayment to 
Arnold White & Durkee Deposit Account No. 01-2508/MOBT:175-2/PAT. 
Enclosed is a copy of the current Power of Attorney in the prior application. 
5. Address all future communications to:

Elizabeth Graf 

ARNOLD WHITE & DURKEE 
750 Bering Drive 
Houston, Texas 77057-2198 
(713)787-1400 

6. The prior application is presently assigned to Monsanto Company. 

7. Enclosed is a preliminary amendment. Any additional fees incurred by this 
amendment are included in the check at No. 2 above and said fee has been 
calculated after calculation of claims and after amendment of claims by the 
preliminary amendment. 

8. Amend the specification by inserting before the first line the sentence: --This is a
divisional of co-pending application Serial No. 09/137,440 filed August 20,
1998--.

9. Enclosed are formal drawings. 

10. Transfer the sequence information, including the computer readable form previously
submitted in the parent application, Serial No. 09/137,440 filed August 20, 1998, 
for use in this application. Under 37 C.F.R. § 1.821(e), Applicant states that the 
paper copy of the sequence listing in this application is identical to the computer 
readable copy in parent application Serial No. 09/137,440 filed August 20, 1998. 
Under 37 C.F.R. § 1.821(f), Applicant also states that the information recorded in 
computer readable form is identical to the written sequence listing. 
[X] 11. Return Receipt Postcard (should be specifically itemized).

Respectfully submitted, 



Christopher J. Buntel, Ph.D. 
Reg. No. 44,573 
P.O. Box 4433 MONSANTO COMPANY
(713) 787-1400



Date: December 16, 1999 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Application of: 

Gerard F. Barry et al. 

Serial No.: To be Assigned 

Filed: December 16, 1999 

For: GLYPHOSATE-TOLERANT 5- 
ENOLPYRUVYLSHIKIMATE-3- 
PHOSPHATE SYNTHASES 



Group Art Unit: To be Assigned 

Examiner: To be Assigned 

Atty. Dkt. No.: MOBT:175-2/PAT



PRELIMINARY AMENDMENT 
I hereby certify that this paper or fee is being deposited with the United
States Postal Service "EXPRESS MAIL POST OFFICE TO 
ADDRESSEE" service under 37 C.F.R. 1.10 on the date indicated 
above and is addressed to: Assistant Commissioner for Patents, 
Washington, D.C. 20231.
Assistant Commissioner for Patents
Washington, D.C. 20231 

Sir: 

Please amend this application as follows: 
In The Specification 

At page 2, line 1, insert the following: 

--This is a divisional of co-pending application Serial No. 09/137,440, filed August 20,
1998.--



In the Claims 

Cancel claims 1-100, without prejudice. 
Please add the following new claims:

--101. (Added) An antibody immunoreactive with a 5-enolpyruvylshikimate-3-phosphate
synthase enzyme, the enzyme comprising the sequence domains:
-R-X1-H-X2-E- (SEQ ID NO:37), in which

X1 is G, S, T, C, Y, N, Q, D or E;

X2 is S or T; and
-G-D-K-X3- (SEQ ID NO:38), in which

X3 is S or T; and
-S-A-Q-X4-K- (SEQ ID NO:39), in which

X4 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; and
-N-X5-T-R- (SEQ ID NO:40), in which

X5 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V.

102. (Added) The antibody of claim 101, wherein X1 is D or N; X2 is S or T; X3 is S or T; X4
is V, I or L; and X5 is P or Q.

103. (Added) The antibody of claim 101, wherein the enzyme comprises SEQ ID NO:3.

104. (Added) The antibody of claim 101, further defined as a polyclonal antibody. 

105. (Added) The antibody of claim 101, further defined as a monoclonal antibody. 

106. (Added) A method of detecting a 5-enolpyruvylshikimate-3-phosphate synthase enzyme 
in a sample, the method comprising: 

selecting a sample suspected of containing a 5-enolpyruvylshikimate-3-phosphate 
synthase enzyme; 

contacting the sample with an antibody to form an enzyme-antibody complex; and 
detecting the presence of the enzyme-antibody complex; wherein the antibody is 
immunoreactive with SEQ ID NO:3. 

107. (Added) A method of detecting a 5-enolpyruvylshikimate-3-phosphate synthase enzyme 
in plant cells or plant tissue, the method comprising: 

selecting plant cells or plant tissue suspected of containing a 5-enolpyruvylshikimate-3-
phosphate synthase enzyme;
preparing a sample from the plant cells or plant tissue; 

contacting the sample with an antibody to form an enzyme-antibody complex; and 
detecting the presence of the enzyme-antibody complex; wherein the antibody is 
immunoreactive with SEQ ID NO:3. 
108. (Added) A kit for the detection of a 5-enolpyruvylshikimate-3-phosphate synthase 
enzyme in a sample, the kit comprising: 

a container comprising an antibody immunoreactive with SEQ ID NO:3; and 
a detection agent.--

REMARKS 

Claims 1-100 were initially filed in the parent case, application Serial No. 09/137,440 and 
thus have been canceled from this divisional application. The parent application was allowed on 
November 23, 1999 but has not yet issued.

The active claims in this case are claims 101-108. Added claims 101-108 correspond to 
cancelled claims 104-111 in the parent case. No new matter is introduced by the addition of 
claims 101-108. 

The specification has been amended to recite the relationship with the parent case, 
namely that it is a divisional application. 

It is believed that no fee is due; however, should any fees under 37 C.F.R. §§ 1.16 to 1.21 
be required for any reason, the Assistant Commissioner is authorized to deduct said fees from 
Arnold White & Durkee Deposit Account No. 01-2508/MOBT:175-2/PAT. 
Respectfully submitted,



ARNOLD, WHITE & DURKEE 

P.O. Box 4433 

Houston, Texas 77210-4433 

(713)787-1400 

Date: December 16, 1999 



Christopher J. Buntel, Ph.D. 
Reg. No. 44,573 
Agent for Assignee, 
Monsanto Company 
GLYPHOSATE-TOLERANT
5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASES

This is a continuation-in-part of a copending U.S. patent application
serial number 07/749,611, filed August 28, 1991, which is a continuation-in-
part of U.S. patent application serial number 07/576,537, filed August 31,
1990, now abandoned.



BACKGROUND OF THE INVENTION 



This invention relates in general to plant molecular biology and, more 
particularly, to a new class of glyphosate-tolerant 5-enolpyruvylshikimate-3- 
phosphate synthases. 

Recent advances in genetic engineering have provided the requisite tools 
to transform plants to contain foreign genes. It is now possible to produce 
plants which have unique characteristics of agronomic importance. Certainly, 
one such advantageous trait is more cost effective, environmentally 
compatible weed control via herbicide tolerance. Herbicide-tolerant plants 
may reduce the need for tillage to control weeds thereby effectively reducing 
soil erosion. 

One herbicide which is the subject of much investigation in this regard is 
N-phosphonomethylglycine commonly referred to as glyphosate. Glyphosate 
inhibits the shikimic acid pathway which leads to the biosynthesis of aromatic 
compounds including amino acids, plant hormones and vitamins. Specifically, 
glyphosate curbs the conversion of phosphoenolpyruvic acid (PEP) and 3- 
phosphoshikimic acid to 5-enolpyruvyl-3-phosphoshikimic acid by inhibiting 
the enzyme 5-enolpyruvylshikimate-3-phosphate synthase (hereinafter 
referred to as EPSP synthase or EPSPS). For purposes of the present
invention, the term "glyphosate" should be considered to include any 
herbicidally effective form of N-phosphonomethylglycine (including any salt 
thereof) and other forms which result in the production of the glyphosate anion 
in planta. 

It has been shown that glyphosate-tolerant plants can be produced by 
inserting into the genome of the plant the capacity to produce a higher level of 
EPSP synthase in the chloroplast of the cell (Shah et al., 1986) which enzyme 
is preferably glyphosate-tolerant (Kishore et al., 1988). Variants of the wild-
type EPSPS enzyme have been isolated which are glyphosate-tolerant as a
result of alterations in the EPSPS amino acid coding sequence (Kishore and
Shah, 1988; Schulz et al., 1984; Sost et al., 1984; Kishore et al., 1986). These
variants typically have a higher Ki for glyphosate than the wild-type EPSPS
enzyme, which confers the glyphosate-tolerant phenotype, but these variants
are also characterized by a high Km for PEP which makes the enzyme
kinetically less efficient (Kishore and Shah, 1988; Sost et al., 1984; Schulz et
al., 1984; Kishore et al., 1986; Sost and Amrhein, 1990). For example, the
apparent Km for PEP and the apparent Ki for glyphosate for the native EPSPS
from E. coli are 10 µM and 0.5 µM while for a glyphosate-tolerant isolate
having a single amino acid substitution of an alanine for the glycine at position
96 these values are 220 µM and 4.0 mM, respectively. A number of
glyphosate-tolerant plant variant EPSPS genes have been constructed by
mutagenesis. Again, the glyphosate-tolerant EPSPS was impaired due to an
increase in the Km for PEP and a slight reduction of the Vmax of the native plant
enzyme (Kishore and Shah, 1988) thereby lowering the catalytic efficiency
(Vmax/Km) of the enzyme. Since the kinetic constants of the variant enzymes
are impaired with respect to PEP, it has been proposed that high levels of
overproduction of the variant enzyme, 40-80 fold, would be required to
maintain normal catalytic activity in plants in the presence of glyphosate
(Kishore et al., 1988).
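
The trade-off described above can be illustrated numerically. The following is a minimal sketch, assuming a simple competitive-inhibition rate law in which glyphosate competes with PEP; that rate law is a textbook model introduced here for illustration, not a statement from the cited studies. The function names are hypothetical, and the kinetic constants are the E. coli values quoted in the preceding paragraph.

```python
def ki_km_ratio(km_um, ki_um):
    """Ratio of Ki (glyphosate) to Km (PEP), both in micromolar."""
    return ki_um / km_um


def competitive_rate(vmax, s_um, km_um, i_um, ki_um):
    """Rate under an assumed competitive-inhibition model.

    v = Vmax * [S] / (Km * (1 + [I] / Ki) + [S])
    """
    return vmax * s_um / (km_um * (1 + i_um / ki_um) + s_um)


# Values quoted in the text for E. coli EPSPS (4.0 mM = 4000 uM).
native = {"km_um": 10.0, "ki_um": 0.5}
gly96_to_ala = {"km_um": 220.0, "ki_um": 4000.0}

print(ki_km_ratio(**native))
print(round(ki_km_ratio(**gly96_to_ala), 2))
```

The variant gains a far larger Ki/Km ratio than the native enzyme, but only at the cost of a 22-fold higher Km for PEP, which is the kinetic penalty the paragraph describes.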

While such variant EPSP synthases have proved useful in obtaining 
transgenic plants tolerant to glyphosate, it would be increasingly beneficial to 
obtain an EPSP synthase that is highly glyphosate-tolerant while still 
kinetically efficient such that the amount of the glyphosate-tolerant EPSPS 
needed to be produced to maintain normal catalytic activity in the plant is 
reduced or that improved tolerance be obtained with the same expression level. 

Previous studies have shown that EPSPS enzymes from different 
sources vary widely with respect to their degree of sensitivity to inhibition by 
glyphosate. A study of plant and bacterial EPSPS enzyme activity as a 
function of glyphosate concentration showed that there was a very wide range 
in the degree of sensitivity to glyphosate. The degree of sensitivity showed no 
correlation with any genus or species tested (Schulz et al., 1985). Insensitivity 
to glyphosate inhibition of the activity of the EPSPS from the Pseudomonas sp. 
PG2982 has also been reported but with no details of the studies (Fitzgibbon, 
1988). In general, while such natural tolerance has been reported, there is no 
report suggesting the kinetic superiority of the naturally occurring bacterial 
glyphosate-tolerant EPSPS enzymes over those of mutated EPSPS enzymes 
nor have any of the genes been characterized. Similarly, there are no reports 
on the expression of naturally glyphosate-tolerant EPSPS enzymes in plants 
to confer glyphosate tolerance. 

For purposes of the present invention the term "mature EPSP 
synthase" relates to the EPSPS polypeptide without the N-terminal 
chloroplast transit peptide. It is now known that the precursor form of the 
EPSP synthase in plants (with the transit peptide) is expressed and upon 
delivery to the chloroplast, the transit peptide is cleaved yielding the mature 
EPSP synthase. All numbering of amino acid positions is given with respect
to the mature EPSP synthase (without chloroplast transit peptide leader) to
facilitate comparison of EPSPS sequences from sources which have
chloroplast transit peptides (i.e., plants and fungi) to sources which do not
utilize a chloroplast targeting signal (i.e., bacteria).

In the amino acid sequences which follow, the standard single letter or
three letter nomenclature is used. All peptide structures represented in the
following description are shown in conventional format in which the amino 
group at the N-terminus appears to the left and the carboxyl group at the C- 
terminus at the right. Likewise, amino acid nomenclature for the naturally 
occurring amino acids found in protein is as follows: alanine (Ala;A), asparagine 
(Asn;N), aspartic acid (Asp;D), arginine (Arg;R), cysteine (Cys;C), glutamic 
acid (Glu;E), glutamine (Gln;Q), glycine (Gly;G), histidine (His;H), isoleucine 
(Ile;I), leucine (Leu;L), lysine (Lys;K), methionine (Met;M), phenylalanine 
(Phe;F), proline (Pro;P), serine (Ser;S), threonine (Thr;T), tryptophan (Trp;W), 
tyrosine (Tyr;Y), and valine (Val;V). An "X" is used when the amino acid residue 
is unknown and parentheses designate that an unambiguous assignment is not 
possible and the amino acid designation within the parentheses is the most 
probable estimate based on known information. 

The term "nonpolar" amino acids includes alanine, valine, leucine,
isoleucine, proline, phenylalanine, tryptophan, and methionine. The term
"uncharged polar" amino acids includes glycine, serine, threonine, cysteine,
tyrosine, asparagine and glutamine. The term "charged polar" amino acids
includes the "acidic" and "basic" amino acids. The term "acidic" amino acids
includes aspartic acid and glutamic acid. The term "basic" amino acids includes
lysine, arginine and histidine. The term "polar" amino acids includes both
"charged polar" and "uncharged polar" amino acids.

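The amino acid classes defined above can be written out as sets of single-letter codes. This is a minimal sketch for illustration only; the constant names and the `classify` helper are assumptions and do not appear in the specification.

```python
# Classes as defined in the preceding paragraph, using single-letter codes.
NONPOLAR = set("AVLIPFWM")
UNCHARGED_POLAR = set("GSTCYNQ")
ACIDIC = set("DE")
BASIC = set("KRH")
CHARGED_POLAR = ACIDIC | BASIC
POLAR = CHARGED_POLAR | UNCHARGED_POLAR


def classify(residue):
    """Return the most specific class name for a single-letter residue."""
    for name, members in (("nonpolar", NONPOLAR),
                          ("uncharged polar", UNCHARGED_POLAR),
                          ("acidic", ACIDIC),
                          ("basic", BASIC)):
        if residue in members:
            return name
    raise ValueError(f"not a standard amino acid code: {residue!r}")


print(classify("D"))
print(sorted(UNCHARGED_POLAR | ACIDIC))
```

The union of the uncharged polar and acidic sets is G, S, T, C, Y, N, Q, D and E.
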
Deoxyribonucleic acid (DNA) is a polymer comprising four
mononucleotide units, dAMP (2'-Deoxyadenosine-5'-monophosphate), dGMP
(2'-Deoxyguanosine-5'-monophosphate), dCMP (2'-Deoxycytidine-5'-
monophosphate) and dTMP (2'-Deoxythymidine-5'-monophosphate) linked in
various sequences by 3',5'-phosphodiester bridges. The structural DNA
consists of multiple nucleotide triplets called "codons" which code for the amino
acids. The codons correspond to the various amino acids as follows: Arg (CGA,
CGC, CGG, CGT, AGA, AGG); Leu (CTA, CTC, CTG, CTT, TTA, TTG); Ser
(TCA, TCC, TCG, TCT, AGC, AGT); Thr (ACA, ACC, ACG, ACT); Pro (CCA,
CCC, CCG, CCT); Ala (GCA, GCC, GCG, GCT); Gly (GGA, GGC, GGG, GGT);
Ile (ATA, ATC, ATT); Val (GTA, GTC, GTG, GTT); Lys (AAA, AAG); Asn
(AAC, AAT); Gln (CAA, CAG); His (CAC, CAT); Glu (GAA, GAG); Asp (GAC,
GAT); Tyr (TAC, TAT); Cys (TGC, TGT); Phe (TTC, TTT); Met (ATG); and Trp
(TGG). Moreover, due to the redundancy of the genetic code (i.e., more than one
codon for all but two amino acids), there are many possible DNA sequences
which may code for a particular amino acid sequence.
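
The redundancy noted above can be made concrete by counting how many DNA sequences encode a given peptide. The sketch below uses the codon assignments listed in the preceding paragraph; the names `CODONS` and `count_encodings` are illustrative assumptions.

```python
from math import prod

# Sense codons per amino acid, as listed in the text (stop codons omitted).
CODONS = {
    "Arg": "CGA CGC CGG CGT AGA AGG".split(),
    "Leu": "CTA CTC CTG CTT TTA TTG".split(),
    "Ser": "TCA TCC TCG TCT AGC AGT".split(),
    "Thr": "ACA ACC ACG ACT".split(),
    "Pro": "CCA CCC CCG CCT".split(),
    "Ala": "GCA GCC GCG GCT".split(),
    "Gly": "GGA GGC GGG GGT".split(),
    "Ile": "ATA ATC ATT".split(),
    "Val": "GTA GTC GTG GTT".split(),
    "Lys": "AAA AAG".split(),
    "Asn": "AAC AAT".split(),
    "Gln": "CAA CAG".split(),
    "His": "CAC CAT".split(),
    "Glu": "GAA GAG".split(),
    "Asp": "GAC GAT".split(),
    "Tyr": "TAC TAT".split(),
    "Cys": "TGC TGT".split(),
    "Phe": "TTC TTT".split(),
    "Met": ["ATG"],
    "Trp": ["TGG"],
}


def count_encodings(peptide):
    """Number of distinct DNA sequences encoding a list of three-letter residues."""
    return prod(len(CODONS[residue]) for residue in peptide)


# A four-residue example: Gly-Asp-Lys-Ser has 4 * 2 * 2 * 6 possible encodings.
print(count_encodings(["Gly", "Asp", "Lys", "Ser"]))
```

Only Met and Trp have a single codon, so any peptide containing other residues has more than one possible coding sequence.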

SUMMARY OF THE INVENTION 

DNA molecules comprising DNA encoding kinetically efficient, 
glyphosate-tolerant EPSP synthases are disclosed. The EPSP synthases of 
the present invention reduce the amount of overproduction of the EPSPS 
enzyme in a transgenic plant necessary for the enzyme to maintain catalytic 
activity while still conferring glyphosate tolerance. The EPSP synthases 
described herein represent a new class of EPSPS enzymes, referred to 
hereinafter as Class II EPSPS enzymes. Class II EPSPS enzymes of the 
present invention usually share only between about 47% and 55% amino acid 
similarity or between about 22% and 30% amino acid identity to other known 
bacterial or plant EPSPS enzymes and exhibit tolerance to glyphosate while 
maintaining suitable Km (PEP) ranges. Suitable ranges of Km (PEP) for EPSPS 
for enzymes of the present invention are between 1-150 µM, with a more
preferred range of between 1-35 µM, and a most preferred range between 2-25
µM. These kinetic constants are determined under the assay conditions
specified hereinafter. An EPSPS of the present invention preferably has a Ki
for glyphosate range of between 15-10000 µM. The Ki/Km ratio should be
between about 2-500, and more preferably between 25-500. The Vmax of the
purified enzyme should preferably be in the range of 2-100 units/mg
(µmoles/minute·mg at 25°C) and the Km for shikimate-3-phosphate should
preferably be in the range of 0.1 to 50 µM.
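
The ranges stated above lend themselves to a simple check. The following sketch assumes that each "between" range is inclusive of its endpoints, which the text does not specify; the function name and result keys are hypothetical. The example input is the glyphosate-tolerant E. coli variant described in the Background (Km 220 µM, Ki 4.0 mM).

```python
def check_class_ii_kinetics(km_pep_um, ki_glyphosate_um):
    """Compare Km (PEP) and Ki (glyphosate), in micromolar, with the stated ranges."""
    ratio = ki_glyphosate_um / km_pep_um
    return {
        "km_suitable": 1 <= km_pep_um <= 150,
        "km_more_preferred": 1 <= km_pep_um <= 35,
        "km_most_preferred": 2 <= km_pep_um <= 25,
        "ki_preferred": 15 <= ki_glyphosate_um <= 10000,
        "ratio_suitable": 2 <= ratio <= 500,
        "ratio_more_preferred": 25 <= ratio <= 500,
    }


# The E. coli variant from the Background: tolerant, but its Km is too high.
result = check_class_ii_kinetics(km_pep_um=220.0, ki_glyphosate_um=4000.0)
for name, ok in result.items():
    print(f"{name}: {ok}")
```

Under these assumptions the variant meets the Ki range and the broader ratio range but fails the Km (PEP) range.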

Genes coding for Class II EPSPS enzymes have been isolated from five 
(5) different bacteria: Agrobacterium tumefaciens sp. strain CP4,
Achromobacter sp. strain LBAA, Pseudomonas sp. strain PG2982, Bacillus 
subtilis, and Staphylococcus aureus. The LBAA and PG2982 Class II EPSPS 
genes have been determined to be identical and the proteins encoded by these 
two genes are very similar to the CP4 protein and share approximately 84% 
amino acid identity with it. Class II EPSPS enzymes often may be 
distinguished from Class I EPSPS's by their inability to react with polyclonal 
antibodies prepared from Class I EPSPS enzymes under conditions where 
other Class I EPSPS enzymes would readily react with the Class I antibodies 
as well as the presence of certain unique regions of amino acid homology which 
are conserved in Class II EPSP synthases as discussed hereinafter. 

Other Class II EPSPS enzymes can be readily isolated and identified by 
utilizing a nucleic acid probe from one of the Class II EPSPS genes disclosed 
herein using standard hybridization techniques. Such a probe from the CP4 
strain has been prepared and utilized to isolate the Class II EPSPS genes from 
strains LBAA and PG2982. These genes may also optionally be adapted for 
enhanced expression in plants by known methodology. Such a probe has also 
been used to identify homologous genes in bacteria isolated de novo from soil. 

The Class II EPSPS enzymes are preferably fused to a chloroplast 
transit peptide (CTP) to target the protein to the chloroplasts of the plant into 
which it may be introduced. Chimeric genes encoding this CTP-Class II EPSPS 
fusion protein may be prepared with an appropriate promoter and 3' 
polyadenylation site for introduction into a desired plant by standard methods. 

To obtain the maximal tolerance to glyphosate herbicide it is preferable 
to transform the desired plant with a plant-expressible Class II EPSPS gene in 
conjunction with another plant-expressible gene which expresses a protein
capable of degrading glyphosate such as a plant-expressible gene encoding a 
glyphosate oxidoreductase enzyme as described in PCT Application No. 
WO 92/00377, the disclosure of which is hereby incorporated by reference. 

Therefore, in one aspect, the present invention provides a new class of 
EPSP synthases that exhibit a low Km for phosphoenolpyruvate (PEP), a high
Vmax/Km ratio, and a high Ki for glyphosate such that when introduced into a
plant, the plant is made glyphosate-tolerant such that the catalytic activity of 
the enzyme and plant metabolism are maintained in a substantially normal 
state. For purposes of this discussion, a highly efficient EPSPS refers to its 
efficiency in the presence of glyphosate. 

More particularly, the present invention provides EPSPS enzymes 
having a Km for phosphoenolpyruvate (PEP) between 1-150 µM and a
Ki(glyphosate)/Km(PEP) ratio between 3-500, said enzymes having the
sequence domains:

-R-X1-H-X2-E- (SEQ ID NO:37), in which

X1 is an uncharged polar or acidic amino acid,

X2 is serine or threonine; and
-G-D-K-X3- (SEQ ID NO:38), in which

X3 is serine or threonine; and
-S-A-Q-X4-K- (SEQ ID NO:39), in which

X4 is any amino acid; and
-N-X5-T-R- (SEQ ID NO:40), in which

X5 is any amino acid.
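
The four sequence domains above can be expressed as regular expressions over single-letter amino acid codes. This is a minimal sketch, not a validated screening tool: the names `DOMAINS` and `has_all_domains` are assumptions, the check ignores the order and position of the domains within the protein, and the test strings are made-up toy sequences rather than real EPSPS sequences.

```python
import re

# X1: uncharged polar (G, S, T, C, Y, N, Q) or acidic (D, E) residues.
X1 = "GSTCYNQDE"
# X4 and X5: any of the twenty standard amino acids.
ANY = "ARNDCQEGHILKMFPSTWYV"

DOMAINS = {
    "SEQ ID NO:37": re.compile(f"R[{X1}]H[ST]E"),
    "SEQ ID NO:38": re.compile("GDK[ST]"),
    "SEQ ID NO:39": re.compile(f"SAQ[{ANY}]K"),
    "SEQ ID NO:40": re.compile(f"N[{ANY}]TR"),
}


def has_all_domains(sequence):
    """True if every one of the four domains occurs somewhere in the sequence."""
    return all(pattern.search(sequence) for pattern in DOMAINS.values())


# Toy sequences for illustration only (hypothetical, not from any organism).
toy_match = "MRDHSEAAGDKSAASAQVKAANPTR"
toy_miss = "MRKHSEAAGDKSAASAQVKAANPTR"  # K at X1 is basic, so domain 37 fails

print(has_all_domains(toy_match))
print(has_all_domains(toy_miss))
```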

Exemplary Class II EPSPS enzyme sequences are disclosed from seven 
sources: Agrobacterium sp. strain designated CP4, Achromobacter sp. strain 
LBAA, Pseudomonas sp. strain PG2982, Bacillus subtilis 1A2, Staphylococcus 
aureus (ATCC 35556), Synechocystis sp. PCC6803 and Dichelobacter nodosus. 
In another aspect of the present invention, a double-stranded DNA
molecule comprising DNA encoding a Class II EPSPS enzyme is disclosed. 
Exemplary Class II EPSPS enzyme DNA sequences are disclosed from seven 
sources: Agrobacterium sp. strain designated CP4, Achromobacter sp. strain 
LBAA, Pseudomonas sp. strain PG2982, Bacillus subtilis 1A2, Staphylococcus 
aureus (ATCC 35556), Synechocystis sp. PCC6803 and Dichelobacter nodosus. 

In a further aspect of the present invention, nucleic acid probes from 
EPSPS Class II genes are presented that are suitable for use in screening for 
Class II EPSPS genes in other sources by assaying for the ability of a DNA 
sequence from the other source to hybridize to the probe. 

In yet another aspect of the present invention, a recombinant, double- 
stranded DNA molecule comprising in sequence: 

a) a promoter which functions in plant cells to cause the production of an 
RNA sequence; 

b) a structural DNA sequence that causes the production of an RNA 
sequence which encodes a Class II EPSPS enzyme having the sequence 
domains: 

-R-X1-H-X2-E- (SEQ ID NO:37), in which 

X1 is an uncharged polar or acidic amino acid,

X2 is serine or threonine; and
-G-D-K-X3- (SEQ ID NO:38), in which

X3 is serine or threonine; and
-S-A-Q-X4-K- (SEQ ID NO:39), in which

X4 is any amino acid; and
-N-X5-T-R- (SEQ ID NO:40), in which

X5 is any amino acid; and

c) a 3' nontranslated region which functions in plant cells to cause the 
addition of a stretch of polyadenyl nucleotides to the 3' end of the RNA 
sequence 
where the promoter is heterologous with respect to the structural DNA
sequence and adapted to cause sufficient expression of the EPSP synthase 
polypeptide to enhance the glyphosate tolerance of a plant cell transformed 
with said DNA molecule. 

In still yet another aspect of the present invention, transgenic plants 
and transformed plant cells are disclosed that are made glyphosate-tolerant by
the introduction of the above-described plant-expressible Class II EPSPS DNA 
molecule into the plant's genome. 

In still another aspect of the present invention, a method for selectively 
controlling weeds in a crop field is presented by planting crop seeds or crop 
plants transformed with a plant-expressible Class II EPSPS DNA molecule to 
confer glyphosate tolerance to the plants which allows for glyphosate 
containing herbicides to be applied to the crop to selectively kill the glyphosate 
sensitive weeds, but not the crops. 

Other and further objects, advantages and aspects of the invention will 
become apparent from the accompanying drawing figures and the description 
of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the DNA sequence (SEQ ID NO:1) for the full-length
promoter of figwort mosaic virus (FMV35S).

Figure 2 shows the cosmid cloning vector pMON17020.

Figure 3 shows the structural DNA sequence (SEQ ID NO:2) for the
Class II EPSPS gene from bacterial isolate Agrobacterium sp. strain CP4 and
the deduced amino acid sequence (SEQ ID NO:3).

Figure 4 shows the structural DNA sequence (SEQ ID NO:4) for the
Class II EPSPS gene from the bacterial isolate Achromobacter sp. strain
LBAA and the deduced amino acid sequence (SEQ ID NO:5).

Figure 5 shows the structural DNA sequence (SEQ ID NO:6) for the
Class II EPSPS gene from the bacterial isolate Pseudomonas sp. strain
PG2982 and the deduced amino acid sequence (SEQ ID NO:7).

Figure 6 shows the Bestfit comparison of the CP4 EPSPS amino acid
sequence (SEQ ID NO:3) with that for the E. coli EPSPS (SEQ ID NO:8).

Figure 7 shows the Bestfit comparison of the CP4 EPSPS amino acid




synthetic CP4 Class II EPSPS gene. 

Figure 9 shows the DNA sequence (SEQ ID NO:10) of the chloroplast
transit peptide (CTP) and encoded amino acid sequence (SEQ ID NO:11)
derived from the Arabidopsis thaliana EPSPS CTP and containing a SphI
restriction site at the chloroplast processing site, hereinafter referred to as

Figure 10 shows the DNA sequence (SEQ ID NO:12) of the chloroplast
transit peptide and encoded amino acid sequence (SEQ ID NO:13) derived from
the Arabidopsis thaliana EPSPS gene and containing an EcoRI restriction site
within the mature region of the EPSPS, hereinafter referred to as CTP3.

Figure 11 shows the DNA sequence (SEQ ID NO:14) of the chloroplast
transit peptide and encoded amino acid sequence (SEQ ID NO:15) derived from
the Petunia hybrida EPSPS CTP and containing a SphI restriction site at the
chloroplast processing site and in which the amino acids at the processing site
are changed to -Cys-Met-, hereinafter referred to as CTP4.

Figure 12 shows the DNA sequence (SEQ ID NO:16) of the chloroplast
transit peptide and encoded amino acid sequence (SEQ ID NO:17) derived from
the Petunia hybrida EPSPS gene with the naturally occurring EcoRI site in the
mature region of the EPSPS gene, hereinafter referred to as CTP5.

Figure 13 shows a plasmid map of CP4 plant transformation/ expression 
vector pMON17110. 
Figure 14 shows a plasmid map of CP4 synthetic EPSPS gene plant
transformation/expression vector pMON17131. 

Figure 15 shows a plasmid map of CP4 EPSPS free DNA plant 
transformation/expression vector pMON13640.

Figure 16 shows a plasmid map of CP4 plant transformation/direct 
selection vector pMON17227. 

Figure 17 shows a plasmid map of CP4 plant transformation/expression 
vector pMON19653.

Figure 18 shows the structural DNA sequence (SEQ ID NO:41) for the
Class II EPSPS gene from the bacterial isolate Bacillus subtilis and the
deduced amino acid sequence (SEQ ID NO:42).

Figure 19 shows the structural DNA sequence (SEQ ID NO:43) for the
Class II EPSPS gene from the bacterial isolate Staphylococcus aureus and the
deduced amino acid sequence (SEQ ID NO:44).

Figure 20 shows the Bestfit comparison of the representative Class II
EPSPS amino acid sequences Pseudomonas sp. strain PG2982 (SEQ ID NO:7),
Achromobacter sp. strain LBAA (SEQ ID NO:5), Agrobacterium sp. strain
designated CP4 (SEQ ID NO:3), Bacillus subtilis (SEQ ID NO:42), and
Staphylococcus aureus (SEQ ID NO:44) with that for representative Class I
EPSPS amino acid sequences [Saccharomyces cerevisiae (SEQ ID NO:49),
Aspergillus nidulans (SEQ ID NO:50), Brassica napus (SEQ ID NO:51),
Arabidopsis thaliana (SEQ ID NO:52), Nicotiana tabacum (SEQ ID NO:53), L.
esculentum (SEQ ID NO:54), Petunia hybrida (SEQ ID NO:55), Zea mays (SEQ
ID NO:56), Salmonella gallinarum (SEQ ID NO:57), Salmonella typhimurium
(SEQ ID NO:58), Salmonella typhi (SEQ ID NO:65), E. coli (SEQ ID NO:8), K.
pneumoniae (SEQ ID NO:59), Y. enterocolitica (SEQ ID NO:60), H. influenzae
(SEQ ID NO:61), P. multocida (SEQ ID NO:62), Aeromonas salmonicida (SEQ
ID NO:63), Bacillus pertussis (SEQ ID NO:64)] and illustrates the conserved
regions among Class II EPSPS sequences which are unique to Class II EPSPS 
sequences. To aid in a comparison of the EPSPS sequences, only mature 
EPSPS sequences were compared. That is, the sequence corresponding to the
chloroplast transit peptide, if present in a subject EPSPS, was removed prior 

Figure 21 shows the structural DNA sequence (SEQ ID NO:66) for the
Class II EPSPS gene from the bacterial isolate Synechocystis sp. PCC6803 and
the deduced amino acid sequence (SEQ ID NO:67).

Figure 22 shows the structural DNA sequence (SEQ ID NO:68) for the
Class II EPSPS gene from the bacterial isolate Dichelobacter nodosus and the
deduced amino acid sequence (SEQ ID NO:69).

Figure 23 shows the Bestfit comparison of the representative Class II
EPSPS amino acid sequences Pseudomonas sp. strain PG2982 (SEQ ID NO:7), 
Achromobacter sp. strain LBAA (SEQ ID NO:5), Agrobacterium sp. strain 
designated CP4 (SEQ ID NO:3), Synechocystis sp. PCC6803 (SEQ ID NO:67), 
Bacillus subtilis (SEQ ID NO:42), Dichelobacter nodosus (SEQ ID NO:69) and 
Staphylococcus aureus (SEQ ID NO:44). 

Figure 24 shows a plasmid map of canola plant transformation/expression
vector pMON17209.

Figure 25 shows a plasmid map of canola plant transformation/expression
vector pMON17237.

STATEMENT OF THE INVENTION 

The expression of a plant gene which exists in double-stranded DNA 
form involves synthesis of messenger RNA (mRNA) from one strand of the 
DNA by RNA polymerase enzyme, and the subsequent processing of the 
mRNA primary transcript inside the nucleus. This processing involves a 
3' non-translated region which adds polyadenylate nucleotides to the 3' end of
the RNA. 

Transcription of DNA into mRNA is regulated by a region of DNA 
usually referred to as the "promoter." The promoter region contains a sequence 
of bases that signals RNA polymerase to associate with the DNA, and to
initiate the transcription into mRNA using one of the DNA strands as a 
template to make a corresponding complementary strand of RNA. A number 
of promoters which are active in plant cells have been described in the 
literature. These include the nopaline synthase (NOS) and octopine synthase 
(OCS) promoters (which are carried on tumor-inducing plasmids of 
Agrobacterium tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 
35S promoters, the light-inducible promoter from the small subunit of ribulose 
bis-phosphate carboxylase (ssRUBISCO, a very abundant plant polypeptide), 
the full-length transcript promoter from the figwort mosaic virus 
(FMV35S), and promoters from the maize ubiquitin and rice actin genes. All of 
these promoters have been used to create various types of DNA constructs 
which have been expressed in plants; see, e.g., PCT publication WO 84/02913 
(Rogers et al., Monsanto). 

Promoters which are known or found to cause transcription of DNA in 
plant cells can be used in the present invention. Such promoters may be 
obtained from a variety of sources such as plants and plant DNA viruses and 
include, but are not limited to, the CaMV35S and FMV35S promoters and 
promoters isolated from plant genes such as ssRUBISCO genes and the maize 
ubiquitin and rice actin genes. As described below, it is preferred that the 
particular promoter selected should be capable of causing sufficient expression 
to result in the production of an effective amount of a Class II EPSPS to 
render the plant substantially tolerant to glyphosate herbicides. The amount of 
Class II EPSPS needed to induce the desired tolerance may vary with the 
plant species. It is preferred that the promoters utilized have relatively high 
expression in all meristematic tissues in addition to other tissues inasmuch as 
it is now known that glyphosate is translocated and accumulated in this type 
of plant tissue. Alternatively, a combination of chimeric genes can be used to 
cumulatively result in the necessary overall expression level of the selected 
Class II EPSPS enzyme to result in the glyphosate-tolerant phenotype. 

The mRNA produced by a DNA construct of the present invention also 
contains a 5' non-translated leader sequence. This sequence can be derived 
from the promoter selected to express the gene, and can be specifically 
modified so as to increase translation of the mRNA. The 5' non-translated 
regions can also be obtained from viral RNAs, from suitable eukaryotic genes, 
or from a synthetic gene sequence. The present invention is not limited to 
constructs, as presented in the following examples, wherein the non-translated 
region is derived from both the 5' non-translated sequence that accompanies 
the promoter sequence and part of the 5' non-translated region of the virus 
coat protein gene. Rather, the non-translated leader sequence can be derived 
from an unrelated promoter or coding sequence as discussed above. 

Preferred promoters for use in the present invention are the full-length 
transcript (SEQ ID NO:1) promoter from the figwort mosaic virus (FMV35S) 
and the full-length transcript (35S) promoter from cauliflower mosaic virus 
(CaMV), including the enhanced CaMV35S promoter (Kay et al. 1987). The 
FMV35S promoter functions as a strong and uniform promoter with particularly 
good expression in meristematic tissue for chimeric genes inserted into plants, 
particularly dicotyledons. The resulting transgenic plant in general expresses 
the protein encoded by the inserted gene at a higher and more uniform level 
throughout the tissues and cells of the transformed plant than the same gene 
driven by an enhanced CaMV35S promoter. Referring to Figure 1, the DNA 
sequence (SEQ ID NO:1) of the FMV35S promoter is located between 
nucleotides 6368 and 6930 of the FMV genome. A 5' non-translated leader 
sequence is preferably coupled with the promoter. The leader sequence can be 
from the FMV35S genome itself or can be from a source other than FMV35S. 

For expression of heterologous genes in monocotyledonous plants the use 
of an intron has been found to enhance expression of the heterologous gene. 
While one may use any of a number of introns which have been isolated from 
plant genes, the use of the first intron from the maize heat shock 70 gene is 
preferred. 

The 3' non-translated region of the chimeric plant gene contains a 
polyadenylation signal which functions in plants to cause the addition of 
polyadenylate nucleotides to the 3' end of the viral RNA. Examples of suitable 
3' regions are (1) the 3' transcribed, non-translated regions containing the 
polyadenylated signal of Agrobacterium tumor-inducing (Ti) plasmid genes, 
such as the nopaline synthase (NOS) gene, and (2) plant genes like the 
soybean storage protein genes and the small subunit of the ribulose-1,5- 
bisphosphate carboxylase (ssRUBISCO) gene. An example of a preferred 
3' region is that from the ssRUBISCO gene from pea (E9), described in greater 
detail below. 

The DNA constructs of the present invention also contain a structural 
coding sequence in double-stranded DNA form which encodes a glyphosate- 
tolerant, highly efficient Class II EPSPS enzyme. 

Identification of glyphosate-tolerant, highly efficient EPSPS enzymes 

In an attempt to identify and isolate glyphosate-tolerant, highly efficient 
EPSPS enzymes, kinetic analysis of the EPSPS enzymes from a number of 
bacteria exhibiting tolerance to glyphosate or that had been isolated from 
suitable sources was undertaken. It was discovered that in some cases the 
EPSPS enzymes showed no tolerance to inhibition by glyphosate and it was 
concluded that the tolerance phenotype of the bacterium was due to an 
impermeability to glyphosate or other factors. In a number of cases, however, 
microorganisms were identified whose EPSPS enzyme showed a greater degree 
of tolerance to inhibition by glyphosate and that displayed a low Km for PEP 
when compared to that previously reported for other microbial and plant 
sources. The EPSPS enzymes from these microorganisms were then subjected 
to further study and analysis. 

Table I displays the data obtained for the EPSPS enzymes identified and 
isolated as a result of the above described analysis. Table I includes data for 
three identified Class II EPSPS enzymes that were observed to have a high 
tolerance to inhibition by glyphosate and a low Km for PEP as well as data for 
the native Petunia EPSPS and a glyphosate-tolerant variant of the Petunia 
EPSPS referred to as GA101. The GA101 variant is so named because it 
exhibits the substitution of an alanine residue for a glycine residue at position 
101 (with respect to Petunia). When the change introduced into the Petunia 
EPSPS (GA101) was introduced into a number of other EPSPS enzymes, 
similar changes in kinetics were observed: an elevation of the Ki for glyphosate 
and of the Km for PEP. 



Table I  Kinetic characterization of EPSPS enzymes 

ENZYME             Km PEP         Ki Glyphosate    Ki/Km 
SOURCE             (uM)           (uM) 

Petunia            5              0.4              0.08 
Petunia GA101      200            2000             10 
PG2982             2.1-3.1 (1)    25-82            ~8-40 
LBAA               ~7.3-8 (2)     60 (est) (7)     ~7.9 
CP4                12 (3)         2720             227 
B. subtilis 1A2    13 (4)         440              33.8 
S. aureus          5 (5)          200              40 

1 Range of PEP tested = 1-40 uM 
2 Range of PEP tested = 5-80 uM 
3 Range of PEP tested = 1.5-40 uM 
4 Range of PEP tested = 1-60 uM 
5 Range of PEP tested = 1-50 uM 
7 (est) = estimated 
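
The ratio of the Ki for glyphosate to the Km for PEP gives a single figure for comparing the enzymes in Table I. The following is a minimal sketch of that arithmetic, not part of the original disclosure. It assumes the single-valued entries read Km = 12, 13 and 5 uM for CP4, B. subtilis and S. aureus (that is, with the trailing footnote digits separated from the values); the dictionary layout and function name are illustrative only.

```python
# Km (PEP) and Ki (glyphosate) in uM for the single-valued rows of Table I.
# Rows reported as ranges or estimates (PG2982, LBAA) are left out.
# Assumption: footnote digits have been separated from the Km values.
TABLE_I = {
    "Petunia": {"km_pep": 5.0, "ki_glyphosate": 0.4},
    "Petunia GA101": {"km_pep": 200.0, "ki_glyphosate": 2000.0},
    "CP4": {"km_pep": 12.0, "ki_glyphosate": 2720.0},
    "B. subtilis": {"km_pep": 13.0, "ki_glyphosate": 440.0},
    "S. aureus": {"km_pep": 5.0, "ki_glyphosate": 200.0},
}


def selectivity_ratio(km_pep, ki_glyphosate):
    """Return Ki(glyphosate) / Km(PEP); larger means more tolerant per unit PEP affinity."""
    return ki_glyphosate / km_pep


ratios = {name: selectivity_ratio(**row) for name, row in TABLE_I.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:15s} Ki/Km = {ratio:.3g}")
```

On this measure the GA101 substitution raises the ratio for the Petunia enzyme only by raising both constants, whereas the Class II enzymes combine a high Ki with a low Km.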

The Agrobacterium sp. strain CP4 was initially identified by its ability to 
grow on glyphosate as a carbon source (10 mM) in the presence of 1 mM 
phosphate. The strain CP4 was identified from a collection obtained from a 
fixed-bed immobilized cell column that employed Mannville R-635 
diatomaceous earth beads. The column had been run for three months on a 
waste-water feed from a glyphosate production plant. The column contained 50 
mg/ml glyphosate and NH3 as NH4Cl. Total organic carbon was 300 mg/ml and 
BOD's (Biological Oxygen Demand - a measure of "soft" carbon availability) 
were less than 30 mg/ml. This treatment column has been described 
(Heitkamp et al., 1990). Dworkin-Foster minimal salts medium containing 
glyphosate at 10 mM and with phosphate at 1 mM was used to select for 
microbes from a wash of this column that were capable of growing on 
glyphosate as sole carbon source. Dworkin-Foster minimal medium was made 
up by combining in 1 liter (with autoclaved H2O), 1 ml each of A, B and C and 
10 ml of D (as per below) and thiamine HCl (5 mg). 



A. D-F Salts (1000X stock; per 100 ml; autoclaved): 
   H3BO3          1 mg 
   MnSO4.7H2O     1 mg 
   ZnSO4.7H2O     12.5 mg 
   CuSO4.5H2O     8 mg 
   NaMoO3.3H2O    1.7 mg 

B. FeSO4.7H2O (1000X stock; per 100 ml; autoclaved)    0.1 g 

C. MgSO4.7H2O (1000X stock; per 100 ml; autoclaved)    20 g 

D. (NH4)2SO4 (100X stock; per 100 ml; autoclaved)      20 g 
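
The final concentration of each component in the made-up medium follows from the stock recipe and the volume of stock added per liter. The sketch below shows that dilution arithmetic for a few of the components; the function name and the choice of mg per liter as the unit are illustrative assumptions, not part of the original disclosure.

```python
def final_mg_per_liter(stock_mg_per_100ml, ml_added, final_volume_ml=1000.0):
    """Final concentration (mg/L) after adding ml_added of a stock to the medium.

    stock_mg_per_100ml: mass of the component dissolved per 100 ml of stock.
    """
    stock_mg_per_ml = stock_mg_per_100ml / 100.0
    total_mg = stock_mg_per_ml * ml_added
    return total_mg * 1000.0 / final_volume_ml


# (component, mg per 100 ml of stock, ml of stock added per liter of medium)
RECIPE = [
    ("H3BO3", 1.0, 1.0),              # stock A, 1000X
    ("FeSO4.7H2O", 100.0, 1.0),       # stock B, 0.1 g per 100 ml
    ("MgSO4.7H2O", 20000.0, 1.0),     # stock C, 20 g per 100 ml
    ("(NH4)2SO4", 20000.0, 10.0),     # stock D, 100X, so 10 ml per liter
]

for name, stock, ml in RECIPE:
    print(f"{name:12s} {final_mg_per_liter(stock, ml):g} mg/L")
```

The 100X stock D is added at ten times the volume of the 1000X stocks, so all four stocks end up at 1X in the finished medium.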

Yeast Extract (YE; Difco) was added to a final concentration of 0.01 or 
0.001%. The strain CP4 was also grown on media composed of D-F salts, 
amended as described above, containing glucose, gluconate and citrate (each at 
0.1 %) as carbon sources and with inorganic phosphate (0.2 - 1.0 mM) as the 
phosphorous source. 

Other Class II EPSPS containing microorganisms were identified as 
Achromobacter sp. strain LBAA (Hallas et al., 1988), Pseudomonas sp. strain 
PG2982 (Moore et al., 1983; Fitzgibbon 1988), Bacillus subtilis 1A2 (Henner et 
al., 1984) and Staphylococcus aureus (O'Connell et al., 1993). It had been 
reported previously, from measurements in crude lysates, that the EPSPS 
enzyme from strain PG2982 was less sensitive to inhibition by glyphosate than 
that of E. coli, but there has been no report of the details of this lack of 
sensitivity and there has been no report on the Km for PEP for this enzyme or 
of the DNA sequence for the gene for this enzyme (Fitzgibbon, 1988; 
Fitzgibbon and Braymer, 1990). 

Relationship of the Class II EPSPS to those previously studied 

All EPSPS proteins studied to date have shown a remarkable degree of 
homology. For example, bacterial and plant EPSPS's are about 54% identical, 
with similarity as high as 80%. Within bacterial EPSPS's and plant 
EPSPS's themselves the degree of identity and similarity is much greater (see 
Table II). 

Table II  Comparison between exemplary Class I EPSPS 
          protein sequences (1) 

                                  similarity    identity 

E. coli vs. S. typhimurium            93            88 
P. hybrida vs. E. coli                72            55 
P. hybrida vs. L. esculentum          93            88 

1 The EPSPS sequences compared here were obtained from the following references: E. 
coli, Rogers et al., 1983; S. typhimurium, Stalker et al., 1985; Petunia hybrida, Shah et al., 
1986; and tomato (L. esculentum), Gasser et al., 1988. 

When crude extracts of CP4 and LBAA bacteria (50 ug protein) were 
probed using rabbit anti-EPSPS antibody (Padgette et al., 1987) to the Petunia 
EPSPS protein in a Western analysis, no positive signal could be detected, 
even with extended exposure times (Protein A - 125I development system) and 
under conditions where the control EPSPS (Petunia EPSPS, 20 ng; a Class I 
EPSPS) was readily detected. The presence of EPSPS activity in these 
extracts was confirmed by enzyme assay. This surprising result, indicating a 
lack of similarity between the EPSPS's from these bacterial isolates and those 
previously studied, coupled with the combination of a low Km for PEP and a 
high Ki for glyphosate, illustrates that these new EPSPS enzymes are different 
from known EPSPS enzymes (now referred to as Class I EPSPS). 



Glyphosate-tolerant Enzymes in Microbial Isolates 

For clarity and brevity of disclosure, the following description of the 
isolation of genes encoding Class II EPSPS enzymes is directed to the isolation 
of such a gene from a bacterial isolate. Those skilled in the art will recognize 
that the same or similar strategy can be utilized to isolate such genes from 
other microbial isolates, plant or fungal sources. 

Cloning of the Agrobacterium sp. strain CP4 EPSPS Gene(s) in E. coli 

Having established the existence of a suitable EPSPS in Agrobacterium 
sp. strain CP4, two parallel approaches were undertaken to clone the gene: 
cloning based on the expected phenotype for a glyphosate-tolerant EPSPS; and 
purification of the enzyme to provide material to raise antibodies and to obtain 
amino acid sequences from the protein to facilitate the verification of clones. 
Cloning and genetic techniques, unless otherwise indicated, are generally those 
described in Maniatis et al., 1982 or Sambrook et al., 1987. The cloning 
strategy was as follows: introduction of a cosmid bank of Agrobacterium 
sp. strain CP4 into E. coli and selection for the EPSPS gene by selection for 
growth on inhibitory concentrations of glyphosate. 

Chromosomal DNA was prepared from Agrobacterium sp. strain 
CP4 as follows: The cell pellet from a 200 ml L-Broth (Miller, 1972), late log 
phase culture of Agrobacterium sp. strain CP4 was resuspended in 10 ml of 
Solution I; 50 mM glucose, 10 mM EDTA, 25 mM Tris-Cl pH 8.0 (Birnboim 
and Doly, 1979). SDS was added to a final concentration of 1% and the 
suspension was subjected to three freeze-thaw cycles, each consisting of 
immersion in dry ice for 15 minutes and in water at 70°C for 10 minutes. The 
lysate was then extracted four times with equal volumes of phenol:chloroform 
(1:1; phenol saturated with TE; TE = 10 mM Tris pH8.0; 1.0 mM EDTA) and 
the phases separated by centrifugation (15000g; 10 minutes). The 
ethanol-precipitable material was pelleted from the supernatant by brief 
centrifugation (8000g; 5 minutes) following addition of two volumes of ethanol. 
The pellet was resuspended in 5 ml TE and dialyzed for 16 hours at 4°C against 
2 liters TE. This preparation yielded a 5 ml DNA solution of 552 ng/ml. 

Partially-restricted DNA was prepared as follows. Three 100 ug aliquot 
samples of CP4 DNA were treated for 1 hour at 37°C with restriction 
endonuclease HindIII at rates of 4, 2 and 1 enzyme unit/ug DNA, respectively. 
The DNA samples were pooled, made 0.25 mM with EDTA and extracted with 
an equal volume of phenol:chloroform. Following the addition of sodium acetate 
and ethanol, the DNA was precipitated with two volumes of ethanol and 
pelleted by centrifugation (12000 g; 10 minutes). The dried DNA pellet was 
resuspended in 500 ul TE and layered on a 10-40% Sucrose gradient (in 5% 
increments of 5.5 ml each) in 0.5 M NaCl, 50 mM Tris pH8.0, 5 mM EDTA. 
Following centrifugation for 20 hours at 26,000 rpm in a SW28 rotor, the tubes 
were punctured and ~1.5 ml fractions collected. Samples (20 ul) of each second 
fraction were run on 0.7% agarose gel and the size of the DNA determined by 
comparison with linearized lambda DNA and HindIII-digested lambda DNA 
standards. Fractions containing DNA of 25-35 kb fragments were pooled, 
desalted on AMICON10 columns (7000 rpm; 20°C; 45 minutes) and 
concentrated by precipitation. This procedure yielded 15 ug of CP4 DNA of the 
required size. A cosmid bank was constructed using the vector pMON17020. 
This vector, a map of which is presented in Figure 2, is based on the pBR327 
replicon and contains the spectinomycin/streptomycin (Spr;spc) resistance 
gene from Tn7 (Fling et al., 1985), the chloramphenicol resistance gene 
(Cmr;cat) from Tn9 (Alton et al., 1979), the gene10 promoter region from phage 
T7 (Dunn et al., 1983), and the 1.6 kb BglII phage lambda cos fragment from 
pHC79 (Hohn and Collins, 1980). A number of cloning sites are located 
downstream of the cat gene. Since the predominant block to the expression of 
genes from other microbial sources in E. coli appears to be at the level of 
transcription, the use of the T7 promoter and supplying the T7 polymerase in 
trans from the pGP1-2 plasmid (Tabor and Richardson, 1985), enables the 
expression of large DNA segments of foreign DNA, even those containing RNA 
polymerase transcription termination sequences. The expression of the spc 
gene is impaired by transcription from the T7 promoter such that only Cmr 
can be selected in strains containing pGP1-2. The use of antibiotic resistances 
such as Cm resistance which do not employ a membrane component is 
preferred due to the observation that high level expression of resistance genes 
that involve a membrane component, i.e. β-lactamase and Amp resistance, 
give rise to a glyphosate-tolerant phenotype. Presumably, this is due to the 
exclusion of glyphosate from the cell by the membrane localized resistance 
protein. It is also preferred that the selectable marker be oriented in the same 
direction as the T7 promoter. 

The vector was then cut with HindIII and treated with calf alkaline 
phosphatase (CAP) in preparation for cloning. Vector and target sequences 
were ligated by combining the following: 

Vector DNA (HindIII/CAP)                              3 ug 
Size fractionated CP4 HindIII fragments               1.5 ug 
10X ligation buffer                                   2.2 ul 
T4 DNA ligase (New England Biolabs) (400 U/ul)        1.0 ul 

and adding H2O to 22.0 ul. This mixture was incubated for 18 hours at 16°C. 
10X ligation buffer is 250 mM Tris-HCl, pH 8.0; 100 mM MgCl2; 100 mM 
Dithiothreitol; 2 mM Spermidine. The ligated DNA (5 ul) was packaged into 
lambda phage particles (Stratagene; Gigapack Gold) using the manufacturer's 
procedure. 
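
The composition of the ligation reaction can be checked with simple dilution arithmetic. The sketch below is illustrative only and assumes that the volume of 10X ligation buffer was 2.2 ul, which is the volume that gives a 1X buffer in the stated 22.0 ul reaction; the variable and function names are not from the original disclosure.

```python
TOTAL_VOLUME_UL = 22.0
BUFFER_10X_UL = 2.2            # assumed reading of the buffer volume
LIGASE_UL = 1.0
LIGASE_UNITS_PER_UL = 400.0
VECTOR_UG = 3.0
INSERT_UG = 1.5

# Composition of the 10X ligation buffer, in mM, as stated in the text.
BUFFER_10X_MM = {
    "Tris-HCl": 250.0,
    "MgCl2": 100.0,
    "dithiothreitol": 100.0,
    "spermidine": 2.0,
}


def dilute(stock, volume_added_ul, total_volume_ul):
    """Concentration after adding volume_added_ul of stock to total_volume_ul."""
    return stock * volume_added_ul / total_volume_ul


buffer_strength_x = dilute(10.0, BUFFER_10X_UL, TOTAL_VOLUME_UL)
final_mm = {
    name: dilute(conc, BUFFER_10X_UL, TOTAL_VOLUME_UL)
    for name, conc in BUFFER_10X_MM.items()
}
ligase_units = LIGASE_UL * LIGASE_UNITS_PER_UL
vector_to_insert_mass_ratio = VECTOR_UG / INSERT_UG

print(f"buffer: {buffer_strength_x:.2f}X, ligase: {ligase_units:.0f} U")
print(f"vector:insert mass ratio = {vector_to_insert_mass_ratio:.1f}:1")
for name, conc in final_mm.items():
    print(f"{name}: {conc:.2f} mM")
```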

A sample (200 ul) of E. coli HB101 (Boyer and Rolland-Dussoix, 1973) 
containing the T7 polymerase expression plasmid pGP1-2 (Tabor and 
Richardson, 1985) and grown overnight in L-Broth (with maltose at 0.2% and 
kanamycin at 50 ug/ml) was infected with 50 ul of the packaged DNA. 
Transformants were selected at 30°C on M9 (Miller, 1972) agar containing 
kanamycin (50 ug/ml), chloramphenicol (25 ug/ml), L-proline (50 ug/ml), 
L-leucine (50 ug/ml) and B1 (5 ug/ml), and with glyphosate at 3.0 mM. Aliquot 
samples were also plated on the same media lacking glyphosate to titer the 
packaged cosmids. Cosmid transformants were isolated on this latter medium 
at a rate of ~5 x 10^5 per ug CP4 HindIII DNA after 3 days at 30°C. Colonies 
arose on the glyphosate agar from day 3 until day 15 with a final rate of ~1 per 
200 cosmids. DNA was prepared from 14 glyphosate-tolerant clones and, 
following verification of this phenotype, was transformed into E. coli 
GB100/pGP1-2 (E. coli GB100 is an aroA derivative of MM294 [Talmadge and 
Gilbert, 1980]) and tested for complementation for growth in the absence of 
added aromatic amino acids and aminobenzoic acids. Other aroA strains such 
as SR481 (Bachman et al., 1980; Padgette et al., 1987), could be used and 
would be suitable for this experiment. The use of GB100 is merely exemplary 
and should not be viewed in a limiting sense. This aroA strain usually requires 
that growth media be supplemented with L-phenylalanine, L-tyrosine and 
L-tryptophan each at 100 ug/ml and with para-hydroxybenzoic acid, 
2,3-dihydroxybenzoic acid and para-aminobenzoic acid each at 5 ug/ml for 
growth in minimal media. Of the fourteen cosmids tested only one showed 
complementation of the aroA- phenotype. Transformants of this cosmid, 
pMON17076, showed weak but uniform growth on the unsupplemented 
minimal media after 10 days. 

The proteins encoded by the cosmids were determined in vivo using a T7 
expression system (Tabor and Richardson, 1985). Cultures of E. coli containing 
pGP1-2 (Tabor and Richardson, 1985) and test and control cosmids were grown 
at 30°C in L-broth (2 ml) with chloramphenicol and kanamycin (25 and 50 
ug/ml, respectively) to a Klett reading of ~50. An aliquot was removed and the 
cells collected by centrifugation, washed with M9 salts (Miller, 1972) and 
resuspended in 1 ml M9 medium containing glucose at 0.2%, thiamine at 20 
ug/ml and containing the 18 amino acids at 0.01% (minus cysteine and 
methionine). Following incubation at 30°C for 90 minutes, the cultures were 
transferred to a 42°C water bath and held there for 15 minutes. Rifampicin 
(Sigma) was added to 200 ug/ml and the cultures held at 42°C for 10 additional 
minutes and then transferred to 30°C for 20 minutes. Samples were pulsed 
with 10 uCi of 35S-methionine for 5 minutes at 30°C. The cells were collected 
by centrifugation and suspended in 60-120 ul cracking buffer (60 mM Tris-HCl 
6.8, 1% SDS, 1% 2-mercaptoethanol, 10% glycerol, 0.01% bromophenol blue). 
Aliquot samples were electrophoresed on 12.5% SDS-PAGE and following 
soaking for 60 minutes in 10 volumes of Acetic Acid-Methanol-water 
(10:30:60), the gel was soaked in ENLIGHTNING™ (DUPONT) following the 
manufacturer's directions, dried, and exposed at -70°C to X-Ray film. Proteins 
of about 45 kd in size, labeled with 35S-methionine, were detected in a number of 
the cosmids, including pMON17076. 

Purification of EPSPS from Agrobacterium sp. strain CP4 

All protein purification procedures were carried out at 3-5°C. EPSPS 
enzyme assays were performed using either the phosphate release or 
radioactive HPLC method, as previously described in Padgette et al., 1987, 
using 1 mM phosphoenol pyruvate (PEP, Boehringer) and 2 mM shikimate-3- 
phosphate (S3P) substrate concentrations. For radioactive HPLC assays, 14C- 
PEP (Amersham) was utilized. S3P was synthesized as previously described in 
Wibbenmeyer et al. 1988. N-terminal amino acid sequencing was performed by 
loading samples onto a Polybrene precycled filter in aliquots while drying. 
Automated Edman degradation chemistry was used to determine the 
N-terminal protein sequence, using an Applied Biosystems Model 470A gas phase 
sequencer (Hunkapiller et al., 1983) with an Applied Biosystems 120A PTH 
analyzer. 

Five 10-litre fermentations were carried out on a spontaneous "smooth" 
isolate of strain CP4 that displayed less clumping when grown in liquid culture. 
This reduced clumping and smooth colony morphology may be due to reduced 
polysaccharide production by this isolate. In the following section dealing with 
the purification of the EPSPS enzyme, CP4 refers to the "smooth" isolate - 
CP4-S1. The cells from the three batches showing the highest specific 
activities were pooled. Cell paste of Agrobacterium sp. CP4 (300 g) was washed 
twice with 0.5 L of 0.9% saline and collected by centrifugation (30 minutes, 
8000 rpm in a GS3 Sorvall rotor). The cell pellet was suspended in 0.9 L 
extraction buffer (100 mM TrisCl, 1 mM EDTA, 1 mM BAM (Benzamidine), 5 
mM DTT, 10% glycerol, pH 7.5) and lysed by 2 passes through a Manton 
Gaulin cell. The resulting solution was centrifuged (30 minutes, 8000 rpm) and 
the supernatant was treated with 0.21 L of 1.5% protamine sulfate (in 100 
mM TrisCl, pH 7.5, 0.2% w/v final protamine sulfate concentration). After 
stirring for 1 hour, the mixture was centrifuged (50 minutes, 8000 rpm) and the 
resulting supernatant treated with solid ammonium sulfate to 40% saturation 
and stirred for 1 hour. After centrifugation (50 minutes, 8000 rpm), the 
resulting supernatant was treated with solid ammonium sulfate to 70% 
saturation, stirred for 50 minutes, and the insoluble protein was collected by 
centrifugation (1 hour, 8000 rpm). This 40-70% ammonium sulfate fraction 
was then dissolved in extraction buffer to give a final volume of 0.2 L, and 
dialyzed twice (Spectrum 10,000 MW cutoff dialysis tubing) against 2 L of 
extraction buffer for a total of 12 hours. 

To the resulting dialyzed 40-70% ammonium sulfate fraction (0.29 L) 
was added solid ammonium sulfate to give a final concentration of 1 M. This 
material was loaded (2 ml/min) onto a column (5 cm x 15 cm, 295 ml) packed 
with phenyl Sepharose CL-4B (Pharmacia) resin equilibrated with extraction 
buffer containing 1 M ammonium sulfate, and washed with the same buffer 
(1.5 L, 2 ml/min). EPSPS was eluted with a linear gradient of extraction buffer 
going from 1 M to 0.00 M ammonium sulfate (total volume of 1.5 L, 2 ml/min). 
Fractions were collected (20 ml) and assayed for EPSPS activity by the 
phosphate release assay. The fractions with the highest EPSPS activity 
(fractions 36-50) were pooled and dialyzed against 3 x 2 L (18 hours) of 10 mM 
TrisCl, 25 mM KCl, 1 mM EDTA, 5 mM DTT, 10% glycerol, pH 7.8. 

The dialyzed EPSPS extract (350 ml) was loaded (5 ml/min) onto a 
column (2.4 cm x 30 cm, 136 ml) packed with Q-Sepharose Fast Flow 
(Pharmacia) resin equilibrated with 10 mM TrisCl, 25 mM KCl, 5 mM DTT, 
10% glycerol, pH 7.8 (Q Sepharose buffer), and washed with 1 L of the same 
buffer. EPSPS was eluted with a linear gradient of Q Sepharose buffer going 
from 0.025 M to 0.40 M KCl (total volume of 1.4 L, 5 ml/min). Fractions were 
collected (15 ml) and assayed for EPSPS activity by the phosphate release 
assay. The fractions with the highest EPSPS activity (fractions 47-60) were 
pooled and the protein was precipitated by adding solid ammonium sulfate to 
80% saturation and stirring for 1 hour. The precipitated protein was collected 
by centrifugation (20 minutes, 12000 rpm in a GSA Sorvall rotor), dissolved in 
Q Sepharose buffer (total volume of 14 ml), and dialyzed against the same 
buffer (2 x 1 L, 18 hours). 

The resulting dialyzed partially purified EPSPS extract (19 ml) was 
loaded (1.7 ml/min) onto a Mono Q 10/10 column (Pharmacia) equilibrated with 
Q Sepharose buffer, and washed with the same buffer (35 ml). EPSPS was 
eluted with a linear gradient of 0.025 M to 0.35 M KCl (total volume of 119 ml, 
1.7 ml/min). Fractions were collected (1.7 ml) and assayed for EPSPS activity 
by the phosphate release assay. The fractions with the highest EPSPS 
activity (fractions 30-37) were pooled (6 ml). 

The Mono Q pool was made 1 M in ammonium sulfate by the addition of 
solid ammonium sulfate and 2 ml aliquots were chromatographed on a Phenyl 
Superose 5/5 column (Pharmacia) equilibrated with 100 mM TrisCl, 5 mM 
DTT, 1 M ammonium sulfate, 10% glycerol, pH 7.5 (Phenyl Superose buffer). 
Samples were loaded (1 ml/min), washed with Phenyl Superose buffer (10 ml), 
and eluted with a linear gradient of Phenyl Superose buffer going from 1 M to 
0.00 M ammonium sulfate (total volume of 60 ml, 1 ml/min). Fractions were 
collected (1 ml) and assayed for EPSPS activity by the phosphate release 
assay. The fractions from each run with the highest EPSPS activity (fractions 
~36-40) were pooled together (10 ml, 2.5 mg protein). For N-terminal amino 
acid sequence determination, a portion of one fraction (#39 from run 1) was 
dialyzed against 50 mM NaHCO3 (2 x 1 L). The resulting pure EPSPS sample 
(0.9 ml, 77 ug protein) was found to exhibit a single N-terminal amino acid 
sequence of: 

XH(G)ASSRPATARKSS(G)LX(G)(T)V(R)IPG(D)(K)(M) (SEQ ID NO:18). 

The remaining Phenyl Superose EPSPS pool was dialyzed against 50 
mM TrisCl, 2 mM DTT, 10 mM KCl, 10% glycerol, pH 7.5 (2 x 1 L). An aliquot 
(0.55 ml, 0.61 mg protein) was loaded (1 ml/min) onto a Mono Q 5/5 column 
(Pharmacia) equilibrated with Q Sepharose buffer, washed with the same 
buffer (5 ml), and eluted with a linear gradient of Q Sepharose buffer going from 
0-0.14 M KCl in 10 minutes, then holding at 0.14 M KCl (1 ml/min). Fractions 
were collected (1 ml) and assayed for EPSPS activity by the phosphate release 
assay and were subjected to SDS-PAGE (10-15%, Phast System, Pharmacia, 
with silver staining) to determine protein purity. Fractions exhibiting a single 
band of protein by SDS-PAGE (22-25, 222 ug) were pooled and dialyzed against 
100 mM ammonium bicarbonate, pH 8.1 (2 x 1 L, 9 hours). 

Trypsinolysis and peptide sequencing of Agrobacterium sp. strain CP4 
EPSPS 

To the resulting pure Agrobacterium sp. strain CP4 EPSPS (111 ug) 
was added 3 ug of trypsin (Calbiochem), and the trypsinolysis reaction was 
allowed to proceed for 16 hours at 37°C. The tryptic digest was then 
chromatographed (1 ml/min) on a C18 reverse phase HPLC column (Vydac) as 
previously described in Padgette et al., 1988 for E. coli EPSPS. For all peptide 
purifications, 0.1% trifluoroacetic acid (TFA, Pierce) was designated buffer 
"RP-A" and 0.1% TFA in acetonitrile was buffer "RP-B". The gradient used for 
elution of the trypsinized Agrobacterium sp. CP4 EPSPS was: 0-8 minutes, 0% 
RP-B; 8-28 minutes, 0-15% RP-B; 28-40 minutes, 15-21% RP-B; 40-68 
minutes, 21-49% RP-B; 68-72 minutes, 49-75% RP-B; 72-74 minutes, 75- 
100% RP-B. Fractions were collected (1 ml) and, based on the elution profile at 
210 nm, at least 70 distinct peptides were produced from the trypsinized 
EPSPS. Fractions 40-70 were evaporated to dryness and redissolved in 150 ul 
each of 10% acetonitrile, 0.1% trifluoroacetic acid. 
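
The elution program for the tryptic digest is a piecewise-linear gradient, so the programmed percentage of RP-B at any time can be found by interpolating between the stated breakpoints. The sketch below is an illustration, not part of the original disclosure; the breakpoint table is taken from the text, while the function name is an assumption. It gives the composition programmed at the pump and ignores any delay volume between pump and detector.

```python
from bisect import bisect_right

# (time in minutes, % RP-B) breakpoints of the gradient described in the text.
GRADIENT = [(0, 0), (8, 0), (28, 15), (40, 21), (68, 49), (72, 75), (74, 100)]


def percent_rp_b(t_min):
    """Programmed % RP-B at t_min, by linear interpolation between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if t_min < times[0] or t_min > times[-1]:
        raise ValueError("time is outside the programmed gradient")
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        # Exactly at the final breakpoint.
        return float(GRADIENT[-1][1])
    (t0, p0), (t1, p1) = GRADIENT[i], GRADIENT[i + 1]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


for minute in (4, 18, 54, 70, 74):
    print(f"{minute:2d} min: {percent_rp_b(minute):.1f}% RP-B")
```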

The fraction 61 peptide was further purified on the C18 column by the 
gradient: 0-5 minutes, 0% RP-B; 5-10 minutes, 0-38% RP-B; 10-30 minutes, 
38-45% RP-B. Fractions were collected based on the UV signal at 210 nm. A large 
peptide peak in fraction 24 eluted at 42% RP-B and was dried down, 
resuspended as described above, and rechromatographed on the C18 column 
with the gradient: 0-5 minutes, 0% RP-B; 5-12 min, 0-38% RP-B; 12-15 min, 
38-39% RP-B; 15-18 minutes, 39% RP-B; 18-20 minutes, 39-41% RP-B; 20-24 
minutes, 41% RP-B; 24-28 minutes, 42% RP-B. The peptide in fraction 25, 
eluting at 41% RP-B and designated peptide 61-24-25, was subjected to N- 
terminal amino acid sequencing, and the following sequence was determined: 

APSM(I)(D)EYPILAV (SEQ ID NO:19) 

The CP4 EPSPS fraction 53 tryptic peptide was further purified by C18 HPLC 
by the gradient 0% B (5 minutes), 0-30% B (5-17 minutes), 30-40% B (17-37 
minutes). The peptide in fraction 28, eluting at 34% B and designated peptide 
53-28, was subjected to N-terminal amino acid sequencing, and the following 
sequence was determined: 
ITGLLEGEDVINTGK (SEQ ID NO:20). 

In order to verify the CP4 EPSPS cosmid clone, a number of 
oligonucleotide probes were designed on the basis of the sequence of two of the 
tryptic sequences from the CP4 enzyme (Table III). The probe identified as 
MID was of very low degeneracy and was used for initial screening. The probes 
identified as EDV-C and EDV-T were based on the same amino acid sequences 
and differ in one position (underlined in Table III below) and were used as 
confirmatory probes, with a positive to be expected only from one of these two 
probes. In the oligonucleotides below, alternate acceptable nucleotides at a 
particular position are designated by a "/" such as A/C/T. 

Table III  Selected CP4 EPSPS peptide sequences and DNA probes 

PEPTIDE 61-24-25 APSM(I)(D)EYPILAV (SEQ ID NO:19) 

Probe MID; 17-mer; mixed probe; 24-fold degenerate 
ATGATA/C/TGAC/TGAG/ATAC/TCC (SEQ ID NO:21) 

PEPTIDE 53-28 ITGLLEGEDVINTGK (SEQ ID NO:20) 

Probe EDV-C; 17-mer; mixed probe; 48-fold degenerate 
GAA/GGAC/TGTA/C/G/TATA/C/TAACAC (SEQ ID NO:22) 
Probe EDV-T; 17-mer; mixed probe; 48-fold degenerate 
GAA/GGAC/TGTA/C/G/TATA/C/TAATAC (SEQ ID NO:23) 
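
The stated length and fold degeneracy of each mixed probe can be checked by counting positions and multiplying the number of alternatives at each one. The sketch below is illustrative and not part of the original disclosure. It assumes the position that distinguishes the two EDV probes is C in EDV-C and T in EDV-T, as the probe names suggest; the function names are hypothetical.

```python
from math import prod


def parse_mixed_probe(notation):
    """Split a probe such as 'GAC/TGT' into per-position lists of allowed bases."""
    positions, i = [], 0
    while i < len(notation):
        choices = [notation[i]]
        i += 1
        # A '/' means the next base is an alternative at the same position.
        while i < len(notation) and notation[i] == "/":
            choices.append(notation[i + 1])
            i += 2
        positions.append(choices)
    return positions


def probe_length(notation):
    """Number of nucleotide positions in the probe."""
    return len(parse_mixed_probe(notation))


def degeneracy(notation):
    """Number of distinct sequences represented by the mixed probe."""
    return prod(len(choices) for choices in parse_mixed_probe(notation))


PROBES = {
    "MID": "ATGATA/C/TGAC/TGAG/ATAC/TCC",
    # Assumption: the distinguishing base is C in EDV-C and T in EDV-T.
    "EDV-C": "GAA/GGAC/TGTA/C/G/TATA/C/TAACAC",
    "EDV-T": "GAA/GGAC/TGTA/C/G/TATA/C/TAATAC",
}

for name, notation in PROBES.items():
    print(f"{name}: {probe_length(notation)}-mer, {degeneracy(notation)}-fold degenerate")
```

Splitting the asparagine codon (AAC or AAT) between two probes halves the degeneracy of each, which is why a positive signal is expected from only one of the pair.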

The probes were labeled using gamma-32P-ATP and polynucleotide 
kinase. DNA from fourteen of the cosmids described above was restricted with 
EcoRI, transferred to membrane and probed with the oligonucleotide probes. 
The conditions used were as follows: prehybridization was carried out in 6X 
SSC, 10X Denhardt's for 2-18 hour periods at 60°C, and hybridization was for 
48-72 hours in 6X SSC, 10X Denhardt's, 100 ug/ml tRNA at 10°C below the Td 
for the probe. The Td of the probe was approximated by the formula 2°C x 
(A+T) + 4°C x (G+C). The filters were then washed three times with 6X SSC for 
ten minutes each at room temperature, dried and autoradiographed. Using the 
MID probe, an -9.9 kb fragment in the pMON17076 cosmid gave the only 
positive signal. This cosmid DNA was then probed with the EDV-C (SEQ ID 
NO:22) and EDV-T (SEQ ID NO:23) probes separately and again this -9.9 kb 
band gave a signal and only with the EDV-T probe. 
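
As an illustration of the Td approximation given above, the sketch below
applies 2°C x (A+T) + 4°C x (G+C) to a mixed probe and reports the lowest and
highest value over all members of the mixture. The text does not say which
member of a mixed probe was used to set the hybridization temperature, so
reporting a range is an assumption made here; the function names are
illustrative.

```python
import re


def positions(probe):
    """Split slash notation (e.g. 'GAC/TG') into per-position base lists."""
    return [t.split("/") for t in re.findall(r"[ACGT](?:/[ACGT])*", probe)]


def td_range(probe):
    """Lowest and highest Td (degrees C) over all members of a mixed probe."""
    low = high = 0
    for bases in positions(probe):
        # A or T contributes 2 degrees, G or C contributes 4 degrees.
        contributions = [2 if base in "AT" else 4 for base in bases]
        low += min(contributions)
        high += max(contributions)
    return low, high


mid_low, mid_high = td_range("ATGATA/C/TGAC/TGAG/ATAC/TCC")
print("MID Td range:", mid_low, "to", mid_high)
print("Hybridization at 10 below Td:", mid_low - 10, "to", mid_high - 10)
```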

The combined data on the glyphosate-tolerant phenotype, the 
complementation of the E. coli aroA- phenotype, the expression of a ~45 kD
protein, and the hybridization to two probes derived from the CP4 EPSPS
amino acid sequence strongly suggested that the pMON17076 cosmid
contained the EPSPS gene. 



Localization and subcloning of the CP4 EPSPS gene 

The CP4 EPSPS gene was further localized as follows: a number of
additional Southern analyses were carried out on different restriction digests of
pMON17076 using the MID (SEQ ID NO:21) and EDV-T (SEQ ID NO:23)
probes separately. Based on these analyses and on subsequent detailed
restriction mapping of the pBlueScript (Stratagene) subclones of the ~9.9 kb
fragment from pMON17076, a 3.8 kb EcoRI-SalI fragment was identified to
which both probes hybridized. This analysis also showed that the MID (SEQ ID
NO:21) and EDV-T (SEQ ID NO:23) probes hybridized to different sides of
BamHI, ClaI, and SacII sites. This 3.8 kb fragment was cloned in both
orientations in pBlueScript to form pMON17081 and pMON17082. The
phenotypes imparted to E. coli by these clones were then determined.
Glyphosate tolerance was determined following transformation into E. coli
MM294 containing pGP1-2 (pBlueScript also contains a T7 promoter) on M9
agar media containing glyphosate at 3 mM. Both pMON17081 and
pMON17082 showed glyphosate-tolerant colonies at three days at 30°C at
about half the size of the controls on the same media lacking glyphosate. This
result suggested that the 3.8 kb fragment contained an intact EPSPS gene.
The apparent lack of orientation-dependence of this phenotype could be
explained by the presence of the T7 promoter at one side of the cloning sites
and the lac promoter at the other. The aroA phenotype was determined in
transformants of E. coli GB100 on M9 agar media lacking aromatic
supplements. In this experiment, carried out with and without the Plac
inducer IPTG, pMON17082 showed much greater growth than pMON17081,
suggesting that the EPSPS gene was expressed from the SalI site towards the
EcoRI site.

Nucleotide sequencing was begun from a number of restriction site ends,
including the BamHI site discussed above. Sequences encoding protein
sequences that closely matched the N-terminus protein sequence and that for
the tryptic fragment 53-28 (SEQ ID NO:20) (the basis of the EDV-T probe)
(SEQ ID NO:23) were localized to the SalI side of this BamHI site. These
data provided conclusive evidence for the cloning of the CP4 EPSPS gene and
for the direction of transcription of this gene. These data coupled with the
restriction mapping data also indicated that the complete gene was located
on an ~2.3 kb XhoI fragment and this fragment was subcloned into
pBlueScript. The nucleotide sequence of almost 2 kb of this fragment was
determined by a combination of sequencing from cloned restriction fragments
and by the use of specific primers to extend the sequence. The nucleotide
sequence of the CP4 EPSPS gene and flanking regions is shown in Figure 3
(SEQ ID NO:2). The sequence corresponding to peptide 61-24-25 (SEQ ID
NO:19) was also located. The sequence was determined using both the
SEQUENASE™ kit from IBI (International Biotechnologies Inc.) and the T7
sequencing/Deaza Kit from Pharmacia.

That the cloned gene encoded the EPSPS activity purified from the
Agrobacterium sp. strain CP4 was verified in the following manner: By a series
of site-directed mutageneses, BglII and NcoI sites were placed at the
N-terminus with the fMet contained within the NcoI recognition sequence, the
first internal NcoI site was removed (the second internal NcoI site was
removed later), and a SacI site was placed after the stop codons. At a later
stage the internal NotI site was also removed by site-directed mutagenesis.
The following list includes the primers for the site-directed mutagenesis
(addition or removal of restriction sites) of the CP4 EPSPS gene.
Mutagenesis was carried out by the procedures of Kunkel et al. (1987),
essentially as described in Sambrook et al. (1989).

PRIMER BgNc (addition of BglII and NcoI sites to N-terminus)
CGTGGATAGATCTAGGAAGACAACCATGGCTCACGGTC
(SEQ ID NO:24)

PRIMER Sph2 (addition of SphI site to N-terminus)
GGATAGATTAAGGAAGACGCGCATGCTTCACGGTGCAAGCAGCC
(SEQ ID NO:25)

PRIMER S1 (addition of SacI site immediately after stop codons)
GGCTGCCTGATGAGCTCCACAATCGCCATCGATGG
(SEQ ID NO:26)

PRIMER N1 (removal of internal NotI recognition site)
CGTCGCTCGTCGTGCGTGGCCGCCCTGACGGC
(SEQ ID NO:27)

PRIMER Nco1 (removal of first internal NcoI recognition site)
CGGGCAAGGCCATGCAGGCTATGGGCGCC
(SEQ ID NO:28)

PRIMER Nco2 (removal of second internal NcoI recognition site)
CGGGCTGCCGCCTGACTATGGGCCTCGTCGG
(SEQ ID NO:29)
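
The purpose stated for each primer can be cross-checked against its sequence
by searching for the recognition sequences of the enzymes involved. The sketch
below is illustrative only. The short primer labels are shorthand for the
entries in the list above, and the recognition sequences are the standard ones
for these enzymes (all palindromic, so searching one strand is sufficient).

```python
# Standard recognition sequences of the enzymes named in the primer list.
SITES = {
    "BglII": "AGATCT",
    "NcoI": "CCATGG",
    "SphI": "GCATGC",
    "SacI": "GAGCTC",
    "NotI": "GCGGCCGC",
}

# Primer sequences as listed above (SEQ ID NO:24 to NO:29).
PRIMERS = {
    "BgNc": "CGTGGATAGATCTAGGAAGACAACCATGGCTCACGGTC",
    "Sph2": "GGATAGATTAAGGAAGACGCGCATGCTTCACGGTGCAAGCAGCC",
    "S1": "GGCTGCCTGATGAGCTCCACAATCGCCATCGATGG",
    "N1": "CGTCGCTCGTCGTGCGTGGCCGCCCTGACGGC",
    "Nco1": "CGGGCAAGGCCATGCAGGCTATGGGCGCC",
    "Nco2": "CGGGCTGCCGCCTGACTATGGGCCTCGTCGG",
}


def sites_in(sequence):
    """Names of the listed enzymes whose site occurs in the sequence."""
    return sorted(name for name, site in SITES.items() if site in sequence)


for label, sequence in PRIMERS.items():
    print(label, sites_in(sequence))
```

The primers that add sites carry exactly the sites named for them, and the
three removal primers carry none of the listed sites, consistent with their
stated purpose.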

This CP4 EPSPS gene was then cloned as an NcoI-BamHI N-terminal
fragment plus a BamHI-SacI C-terminal fragment into a PrecA-gene10L
expression vector similar to those described (Wong et al., 1988; Olins et al.,
1988) to form pMON17101. The Km for PEP and the Ki for glyphosate were
determined for the EPSPS activity in crude lysates of pMON17101/GB100
transformants following induction with nalidixic acid (Wong et al., 1988) and
found to be the same as that determined for the purified and crude enzyme
preparations from Agrobacterium sp. strain CP4.

Characterization of the EPSPS gene from Achromobacter sp. strain
LBAA and from Pseudomonas sp. strain PG2982

A cosmid bank of partially HindIII-restricted LBAA DNA was
constructed in E. coli MM294 in the vector pHC79 (Hohn and Collins, 1980).
This bank was probed with a full length CP4 EPSPS gene probe by colony
hybridization and positive clones were identified at a rate of ~1 per 400
cosmids. The LBAA EPSPS gene was further localized in these cosmids by
Southern analysis. The gene was located on an ~2.8 kb XhoI fragment and by
a series of sequencing steps, both from restriction fragment ends and by using
the oligonucleotide primers from the sequencing of the CP4 EPSPS gene, the
nucleotide sequence of the LBAA EPSPS gene was completed and is presented
in Figure 4 (SEQ ID NO:4).

The EPSPS gene from PG2982 was also cloned. The EPSPS protein was
purified, essentially as described for the CP4 enzyme, with the following
differences: Following the Sepharose CL-4B column, the fractions with the
highest EPSPS activity were pooled and the protein precipitated by adding
solid ammonium sulfate to 85% saturation and stirring for 1 hour. The
precipitated protein was collected by centrifugation, resuspended in Q
Sepharose buffer and following dialysis against the same buffer was loaded
onto the column (as for the CP4 enzyme). After purification on the Q
Sepharose column, ~40 mg of protein in 100 mM Tris pH 7.8, 10% glycerol,
1 mM EDTA, 1 mM DTT, and 1 M ammonium sulfate, was loaded onto a
Phenyl Superose (Pharmacia) column. The column was eluted at 1.0
ml/minute with a 40 ml gradient from 1.0 M to 0.00 M ammonium sulfate in
the above buffer.

Approximately 1.0 mg of protein from the active fractions of the Phenyl
Superose 10/10 column was loaded onto a Pharmacia Mono P 5/10
Chromatofocusing column with a flow rate of 0.75 ml/minute. The starting
buffer was 25 mM bis-Tris at pH 6.3, and the column was eluted with 39 ml of
Polybuffer 74, pH 4.0. Approximately 50 µg of the peak fraction from the
Chromatofocusing column was dialyzed into 25 mM ammonium bicarbonate.
This sample was then used to determine the N-terminal amino acid sequence.

The N-terminal sequence obtained was: 

XHSASPKPATARRSE (where X= an unidentified residue) 
(SEQ ID NO:30) 

A number of degenerate oligonucleotide probes were designed based on
this sequence and used to probe a library of PG2982 partial-HindIII DNA in
the cosmid pHC79 (Hohn and Collins, 1980) by colony hybridization under
nonstringent conditions. Final washing conditions were 15 minutes with 1X
SSC, 0.1% SDS at 55°C. One probe with the sequence
GCGGTBGCSGGYTTSGG (where B = C, G, or T; S = C or G, and Y = C or T)
(SEQ ID NO:31) identified a set of cosmid clones.
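
The relationship between this probe and the N-terminal sequence can be
examined by expanding the degenerate positions and translating each member.
The sketch below is illustrative. It assumes the standard genetic code and
reads the probe as the strand complementary to the coding strand, which is an
inference from the sequence itself and is not stated in the text; the function
names are hypothetical.

```python
from itertools import product

# Ambiguity codes as defined in the text for this probe.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "B": "CGT", "S": "CG", "Y": "CT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(probe):
    """All unambiguous oligonucleotides represented by a degenerate probe."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in probe))]


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the first base; ignore leftovers."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


PROBE = "GCGGTBGCSGGYTTSGG"           # SEQ ID NO:31
N_TERMINUS = "XHSASPKPATARRSE"        # SEQ ID NO:30

members = expand(PROBE)
peptides = {translate(reverse_complement(m)) for m in members}
print(len(members), "probe members encode:", peptides)
print("found in N-terminal sequence:", all(p in N_TERMINUS for p in peptides))
```

Under these assumptions every member of the mixture corresponds to the same
five residues within the determined N-terminal sequence.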

The cosmid set identified in this way was made up of cosmids of diverse
HindIII fragments. However, when this set was probed with the CP4 EPSPS
gene probe, a cosmid containing the PG2982 EPSPS gene was identified
(designated as cosmid 9C1 originally and later as pMON20107). By a series of
restriction mappings and Southern analysis this gene was localized to an ~2.8
kb XhoI fragment and the nucleotide sequence of this gene was determined.
This DNA sequence (SEQ ID NO:6) is shown in Figure 5. There are no
nucleotide differences between the EPSPS gene sequences from LBAA (SEQ
ID NO:4) and PG2982 (SEQ ID NO:6). The kinetic parameters of the two
enzymes agree within the range of experimental error.

A gene from PG2982 that imparts glyphosate tolerance in E. coli has
been sequenced (Fitzgibbon, 1988; Fitzgibbon and Braymer, 1990). The
sequence of the PG2982 EPSPS Class II gene shows no homology to the
previously reported sequence, suggesting that the glyphosate-tolerant
phenotype of the previous work is not related to EPSPS.

Characterization of the EPSPS from Bacillus subtilis

Bacillus subtilis 1A2 (prototroph) was obtained from the Bacillus
Genetic Stock Center at Ohio State University. Standard EPSPS assay
reactions contained crude bacterial extract with 1 mM phosphoenolpyruvate
(PEP), 2 mM shikimate-3-phosphate (S3P), 0.1 mM ammonium molybdate, 5
mM potassium fluoride, and 50 mM HEPES, pH 7.0 at 25°C. One unit (U) of
EPSPS activity is defined as one µmol EPSP formed per minute under these
conditions. For kinetic determinations, reactions contained crude bacterial
extract, 2 mM S3P, varying concentrations of PEP, and 50 mM HEPES, pH 7.0
at 25°C. The EPSPS specific activity was found to be 0.003 U/mg. When the
assays were performed in the presence of 1 mM glyphosate, 100% of the EPSPS
activity was retained. The appKm(PEP) of the B. subtilis EPSPS was
determined by measuring the reaction velocity at varying concentrations of
PEP. The results were analyzed graphically by the hyperbolic,
Lineweaver-Burk and Eadie-Hofstee plots, which yielded appKm(PEP) values
of 15.3 µM, 10.8 µM and 12.2 µM, respectively. These three data treatments
are in good agreement, and yield an average value for appKm(PEP) of 13 µM.
The appKi(glyphosate) was estimated by determining the reaction rates of B.
subtilis 1A2 EPSPS in the presence of several concentrations of glyphosate, at
a PEP concentration of 2 µM. These results were compared to the calculated
Vmax of the EPSPS, and making the assumption that glyphosate is a
competitive inhibitor versus PEP for B. subtilis EPSPS, as it is for all other
characterized EPSPSs, an appKi(glyphosate) was determined graphically. The
appKi(glyphosate) was found to be 0.44 mM.
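
The graphical treatments named above can be mirrored numerically. The sketch
below is a minimal illustration, not the procedure actually used: it fits the
Lineweaver-Burk and Eadie-Hofstee linearizations by least squares and solves
the competitive-inhibition rate equation for Ki. The substrate concentrations
and rates are hypothetical, noise-free values generated from assumed
constants; only the three reported appKm(PEP) estimates come from the text.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def km_lineweaver_burk(s, v):
    """1/v = (Km/Vmax)(1/S) + 1/Vmax, so Km = slope / intercept."""
    slope, intercept = linear_fit([1 / x for x in s], [1 / y for y in v])
    return slope / intercept


def km_eadie_hofstee(s, v):
    """v = Vmax - Km (v/S), so Km = -slope."""
    slope, _ = linear_fit([y / x for x, y in zip(s, v)], v)
    return -slope


def ki_competitive(v, vmax, s, km, inhibitor):
    """Solve v = Vmax*S / (Km*(1 + I/Ki) + S) for Ki."""
    km_apparent = s * (vmax / v - 1)
    return inhibitor / (km_apparent / km - 1)


# Hypothetical data: Km = 13 (µM), Vmax = 1 (arbitrary units), no noise.
KM_ASSUMED, VMAX_ASSUMED = 13.0, 1.0
pep = [2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
rates = [VMAX_ASSUMED * s / (KM_ASSUMED + s) for s in pep]

print("Lineweaver-Burk Km:", km_lineweaver_burk(pep, rates))
print("Eadie-Hofstee Km:", km_eadie_hofstee(pep, rates))

# Average of the three appKm(PEP) estimates reported in the text (µM).
reported = [15.3, 10.8, 12.2]
print("mean reported appKm:", sum(reported) / len(reported))
```

With real, noisy data the two linearizations weight the points differently,
which is one reason the three treatments give somewhat different estimates.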

The EPSPS expressed from the B. subtilis aroE gene described by
Henner et al. (1986) was also studied. The source of the B. subtilis aroE
(EPSPS) gene was the E. coli plasmid-bearing strain ECE13 (original code =
MM294[p trp100]; Henner et al., 1984; obtained from the Bacillus Genetic
Stock Center at Ohio State University; the culture genotype is [pBR322
trp100] Ap [in MM294] [pBR322::6 kb insert with trpFBA-hisH]). Two
strategies were taken to express the enzyme in E. coli GB100 (aroA-): 1) the
gene was isolated by PCR and cloned into an overexpression vector, and 2) the
gene was subcloned into an overexpression vector. For the PCR cloning of the
B. subtilis aroE from ECE13, two oligonucleotides were synthesized which
incorporated two restriction enzyme recognition sites (NdeI and EcoRI) to the
sequences of the following oligonucleotides:

GGAACATATGAAACGAGATAAGGTGCAG (SEQ ID NO:45) 

GGAATTCAAACTTCAGGATCTTGAGATAGAAAATG (SEQ ID NO:46) 

The other approach to the isolation of the B. subtilis aroE gene, subcloning 
from ECE13 into pUC118, was performed as follows: 

(i) Cut ECE13 and pUC118 with XmaI and SphI.

(ii) Isolate 1700 bp aroE fragment and 2600 bp pUC118 vector fragment.

(iii) Ligate fragments and transform into GB100. 

The subclone was designated pMON21133 and the PCR-derived clone was
named pMON21132. Clones from both approaches were first confirmed for
complementation of the aroA mutation in E. coli GB100. The cultures
exhibited EPSPS specific activities of 0.044 U/mg and 0.71 U/mg for the
subclone (pMON21133) and PCR-derived clone (pMON21132) enzymes,
respectively. These specific activities reflect the expression levels expected
of the two vectors. The B. subtilis EPSPS was found to be 88% and
100% resistant to inhibition by 1 mM glyphosate under these conditions for the
subcloned (pMON21133) and PCR-derived (pMON21132) enzymes,
respectively. The appKm(PEP) and the appKi(glyphosate) of the subcloned B.
subtilis EPSPS (pMON21133) were determined as described above. The data
were analyzed graphically by the same methods used for the 1A2 isolate, and
the results obtained were comparable to those reported above for the B.
subtilis 1A2 culture.



Characterization of the EPSPS gene from Staphylococcus aureus

The kinetic properties of the S. aureus EPSPS expressed in E. coli were
determined, including the specific activity, the appKm(PEP), and the
appKi(glyphosate). The S. aureus EPSPS gene has been previously described
(O'Connell et al., 1993).

The strategy taken for the cloning of the S. aureus EPSPS was
polymerase chain reaction (PCR), utilizing the known nucleotide sequence of
the S. aureus aroA gene encoding EPSPS (O'Connell et al., 1993). The S.
aureus culture (ATCC 35556) was fermented in an M2 facility in three 250 mL
shake flasks containing 55 mL of TYE (tryptone 5 g/L, yeast extract 3 g/L, pH
6.8). The three flasks were inoculated with 1.5 mL each of a suspension made
from freeze dried ATCC 35556 S. aureus cells in 90 mL of PBS
(phosphate-buffered saline) buffer. Flasks were incubated at 30°C for 5 days
while shaking at 250 rpm. The resulting cells were lysed (boiled in TE
[tris/EDTA] buffer for 8 minutes) and the DNA utilized for PCR reactions. The
EPSPS gene was amplified using PCR and engineered into an E. coli expression
vector as follows:



(i) Two oligonucleotides were synthesized which incorporated two restriction
enzyme recognition sites (NcoI and SacI) to the sequences of the
oligonucleotides:

GGGGCCATGGTAAATGAACAAATCATTG (SEQ ID NO:47)

GGGGGAGCTCATTATCCCTCATTTTGTAAAAGC (SEQ ID NO:48)

(ii) The purified, PCR-amplified aroA gene from S. aureus was digested using
NcoI and SacI enzymes.

(iii) DNA of pMON5723, which contains a PrecA bacterial promoter and
Gene10 leader sequence (Olins et al., 1988), was digested with NcoI and SacI
and the 3.5 kb digestion product was purified.

(iv) The S. aureus PCR product and the NcoI/SacI pMON5723 fragment
were ligated and transformed into E. coli JM101 competent cells.

(v) Two spectinomycin-resistant E. coli JM101 clones from above (SA#2
and SA#3) were purified and transformed into a competent aroA- E. coli
strain, GB100.

For complementation experiments SAGB#2 and SAGB#3 were utilized, 
which correspond to SA#2 and SA#3, respectively, transformed into E. coli 
GB100. In addition, E. coli GB100 (negative control) and pMON9563 (wt
petunia EPSPS, positive control) were tested for AroA complementation. The
organisms were grown in minimal media with and without aromatic amino acids.
Later analyses showed that the SA#2 and SA#3 clones were identical, and 
they were assigned the plasmid identifier pMON21139. 

SAGB#2 in E. coli GB100 (pMON21139) was also grown in M9 minimal
media and induced with nalidixic acid. A negative control, E. coli GB100, was
grown under identical conditions except the media was supplemented with
aromatic amino acids. The cells were harvested, washed with 0.9% NaCl, and
frozen at -80°C, for extraction and EPSPS analysis.

The frozen pMON21139 E. coli GB100 cell pellet from above was
extracted and assayed for EPSPS activity as previously described. EPSPS
assays were performed using 1 mM phosphoenolpyruvate (PEP), 2 mM
shikimate-3-phosphate (S3P), 0.1 mM ammonium molybdate, 5 mM
potassium fluoride, pH 7.0, 25°C. The total assay volume was 50 µL, which
contained 10 µL of the undiluted desalted extract.

The results indicate that the two clones contain a functional 
aroA/EPSPS gene since they were able to grow in minimal media which 
contained no aromatic amino acids. As expected, the GB100 culture did not 
grow on minimal medium without aromatic amino acids (since no functional 
EPSPS is present), and pMON9563 did confer growth in minimal media.
These results demonstrated the successful cloning of a functional EPSPS gene 
from S. aureus. Both clones tested were identical, and the E. coli expression 
vector was designated pMON21139. 

The plasmid pMON21139 in E. coli GB100 was grown in M9 minimal 
media and was induced with nalidixic acid to induce EPSPS expression driven 
from the RecA promoter. A desalted extract of the intracellular protein was 
analyzed for EPSPS activity, yielding an EPSPS specific activity of 0.005
µmol/min/mg. Under these assay conditions, the S. aureus EPSPS activity was
completely resistant to inhibition by 1 mM glyphosate. Previous analysis had 
shown that E. coli GB100 is devoid of EPSPS activity. 

The appKm(PEP) of the S. aureus EPSPS was determined by measuring
the reaction velocity of the enzyme (in crude bacterial extracts) at varying
concentrations of PEP. The results were analyzed graphically using several
standard kinetic plotting methods. Data analysis using the hyperbolic,
Lineweaver-Burk, and Eadie-Hofstee methods yielded appKm(PEP) constants
of 7.5, 4.8, and 4.0 µM, respectively. These three data treatments are in good
agreement, and yield an average value for appKm(PEP) of 5 µM.

Further information on the glyphosate tolerance of S. aureus EPSPS was
obtained by determining the reaction rates of the enzyme in the presence of
several concentrations of glyphosate, at a PEP concentration of 2 µM. These
results were compared to the calculated maximal velocity of the EPSPS, and
making the assumption that glyphosate is a competitive inhibitor versus PEP
for S. aureus EPSPS, as it is for all other characterized EPSPSs, an
appKi(glyphosate) was determined graphically. The appKi(glyphosate) for S.
aureus EPSPS estimated using this method was found to be 0.20 mM.

The EPSPS from S. aureus was found to be glyphosate-tolerant, with an
appKi(glyphosate) of approximately 0.2 mM. In addition, the appKm(PEP) for
the enzyme is approximately 5 µM, yielding an appKi(glyphosate)/appKm(PEP)
ratio of 40.
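
The ratio quoted above follows from the two constants once they are expressed
in the same units. The short sketch below shows the arithmetic; the function
name is illustrative. The B. subtilis ratio is not stated in the text and is
computed here only from the appKi and appKm values reported for that enzyme.

```python
def tolerance_ratio(ki_mM, km_uM):
    """appKi(glyphosate) / appKm(PEP), with Ki in mM and Km in µM."""
    return (ki_mM * 1000.0) / km_uM


# S. aureus: appKi = 0.20 mM, appKm = 5 µM (values from the text).
print("S. aureus:", round(tolerance_ratio(0.20, 5.0)))

# B. subtilis: appKi = 0.44 mM, appKm = 13 µM (values from the text).
print("B. subtilis:", round(tolerance_ratio(0.44, 13.0)))
```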

Alternative Isolation Protocols for Other Class II EPSPS Structural 
Genes 

A number of Class II genes have been isolated and described here. While 
the cloning of the gene from CP4 was difficult due to the low degree of similarity 
between the Class I and Class II enzymes and genes, the identification of the 
other genes was greatly facilitated by the use of this first gene as a probe. In
the cloning of the LBAA EPSPS gene, the CP4 gene probe allowed the rapid
identification of cosmid clones and the localization of the intact gene to a small
restriction fragment, and some of the CP4 sequencing primers were also used to
sequence the LBAA (and PG2982) EPSPS gene(s). The CP4 gene probe was
also used to confirm the PG2982 gene clone. The high degree of similarity of the
Class II EPSPS genes may be used to identify and clone additional genes in
much the same way that Class I EPSPS gene probes have been used to clone
other Class I genes. An example of the latter was the cloning of the A.
thaliana EPSPS gene using the P. hybrida gene as a probe (Klee et al., 1987).

Glyphosate-tolerant EPSPS activity has been reported previously for
EPSP synthases from a number of sources. These enzymes have not been
characterized to any extent in most cases. The use of Class I and Class II
EPSPS gene probes or antibody probes provides a rapid means of initially
screening for the nature of the EPSPS and provides tools for the rapid cloning
and characterization of the genes for such enzymes.

Two of the three genes described were isolated from bacteria that were
isolated from a glyphosate treatment facility (Strains CP4 and LBAA). The
third (PG2982) was from a bacterium that had been isolated from a culture
collection strain. This latter isolation confirms that exposure to glyphosate is
not a prerequisite for the isolation of highly glyphosate-tolerant EPSPS
enzymes and that the screening of collections of bacteria could yield additional
isolates. It is possible to enrich for glyphosate degrading or glyphosate
resistant microbial populations (Quinn et al., 1988; Talbot et al., 1984) in
cases where it is felt that enrichment for such microorganisms would enhance
the isolation frequency of Class II EPSPS microorganisms.

Additional bacteria containing Class II EPSPS genes have also been
identified. A bacterium called C12, isolated from the same treatment column
beads as CP4 (see above) but in a medium in which glyphosate was supplied as
both the carbon and phosphorus source, was shown by Southern analysis to
hybridize with a probe consisting of the CP4 EPSPS coding sequence. This
result, in conjunction with that for strain LBAA, suggests that this enrichment
method facilitates the identification of Class II EPSPS isolates.

New bacterial isolates containing Class II EPSPS genes have also been
identified from environments other than glyphosate waste treatment
facilities. An inoculum was prepared by extracting soil (from a recently
harvested soybean field in Jerseyville, Illinois) and a population of bacteria
selected by growth at 28°C in Dworkin-Foster medium containing glyphosate
at 10 mM as a source of carbon (and with cycloheximide at 100 µg/ml to
prevent the growth of fungi). Upon plating on L-agar media, five colony types
were identified. Chromosomal DNA was prepared from 2 ml L-broth cultures of
these isolates and the presence of a Class II EPSPS gene was probed using
the CP4 EPSPS coding sequence probe by Southern analysis under stringent
hybridization and washing conditions. One of the soil isolates, S2, was positive
by this screen.

Class II EPSPS enzymes are identifiable by an elevated Ki for
glyphosate and thus the genes for these will impart a glyphosate tolerance
phenotype in heterologous hosts. Expression of the gene from recombinant
plasmids or phage may be achieved through the use of a variety of expression
promoters, including the T7 promoter and polymerase. The T7 promoter and
polymerase system has been shown to work in a wide range of bacterial (and
mammalian) hosts and offers the advantage of expression of many proteins
that may be present on large cloned fragments. Tolerance to growth on
glyphosate may be shown on minimal growth media. In some cases, other
genes or conditions that may give glyphosate tolerance have been observed,
including overexpression of beta-lactamase, the igrA gene (Fitzgibbon and
Braymer, 1990), or the gene for glyphosate oxidoreductase (PCT Pub. No.
WO92/00377). These are easily distinguished from Class II EPSPS by the
absence of EPSPS enzyme activity.

The EPSPS protein is expressed from the aroA gene (also called aroE in
some genera, for example, in Bacillus) and mutants in this gene have been
produced in a wide variety of bacteria. Determining the identity of the donor
organism (bacterium) aids in the isolation of Class II EPSPS genes. Such
identification may be accomplished by standard microbiological methods and
could include Gram stain reaction, growth, color of culture, and gas or acid
production on different substrates, gas chromatography analysis of
methylesters of the fatty acids in the membranes of the microorganism, and
determination of the GC% of the genome. The identity of the donor provides
information that may be used to more easily isolate the EPSPS gene. An
AroA- host more closely related to the donor organism could be employed to
clone the EPSPS gene by complementation but this is not essential since
complementation of the E. coli AroA mutant by the CP4 EPSPS gene was
observed. In addition, the information on the GC content of the genome may be
used in choosing nucleotide probes: donor sources with high GC% would
preferably use the CP4 EPSPS gene or sequences as probes and those donors
with low GC would preferably employ those from Bacillus subtilis, for example.

Relationships between different EPSPS genes

The deduced amino acid sequences of a number of Class I and the
Class II EPSPS enzymes were compared using the Bestfit computer program
provided in the UWGCG package (Devereux et al., 1984). The degree of
similarity and identity as determined using this program is reported. The
degree of similarity/identity determined within Class I and Class II protein
sequences is remarkably high, for instance, comparing E. coli with S.
typhimurium (similarity/identity = 93%/88%) and even comparing E. coli with
a plant EPSPS (Petunia hybrida; 72%/55%). These data are shown in Table IVA.
The comparison of sequences between Class I and Class II, however, shows a
much lower degree of relatedness between the Classes (similarity/identity =
50-53%/23-30%). The display of the Bestfit analysis for the E. coli (SEQ ID
NO:8) and CP4 (SEQ ID NO:3) sequences shows the positions of the conserved
residues and is presented in Figure 6. Previous analyses of EPSPS sequences
had noted the high degree of conservation of sequences of the enzymes and the
near invariance of sequences in two regions - the "20-35" and "95-107"
regions (Gasser et al., 1988; numbered according to the Petunia EPSPS
sequence) - and these regions are less conserved in the case of CP4 and LBAA
when compared to Class I bacterial and plant EPSPS sequences (see Figure 6
for a comparison of the E. coli and CP4 EPSPS sequences with the E. coli
sequence appearing as the top sequence in the Figure). The corresponding
sequences in the CP4 Class II EPSPS are:



PGDKSISHRSFMFGGL (SEQ ID NO:32) and 

LDFGNAATGCRLT (SEQ ID NO:33). 

These comparisons show that the overall relatedness of Class I and
Class II EPSPS proteins is low and that sequences in putative conserved
regions have also diverged considerably.
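
For readers unfamiliar with the identity figures quoted in these comparisons,
the sketch below shows one common way to compute percent identity from a
pairwise alignment. It is not the Bestfit algorithm, and Bestfit's similarity
score (which depends on a residue scoring matrix) is not reproduced. The two
aligned strings are hypothetical, "." marks a gap, and counting only columns
without gaps is an assumption about the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="."):
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, for illustration only.
top = "MKT.AYIAK"
bottom = "MRTLAY.AR"
print(round(percent_identity(top, bottom), 1))
```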

In the CP4 EPSPS an alanine residue is present at the "glycine 101" 
position. The replacement of the conserved glycine (from the "95-107" region) 
by an alanine results in an elevated Ki for glyphosate and in an elevation in the 
Km for PEP in Class I EPSPS. In the case of the CP4 EPSPS, which contains 
an alanine at this position, the Km for PEP is in the low range, indicating that
the Class II enzymes differ in many aspects from the EPSPS enzymes 
heretofore characterized. 

Within the Class II isolates, the degree of similarity/identity is as high as
that noted within Class I (Table IVA). Figure 7 displays the Bestfit
computer program alignment of the CP4 (SEQ ID NO:3) and LBAA (SEQ ID 
NO:5) EPSPS deduced amino acid sequences with the CP4 sequence appearing 
as the top sequence in the Figure. The symbols used in Figures 6 and 7 are the 
standard symbols used in the Bestfit computer program to designate degrees 
of similarity and identity. 



Table IVA(1,2)

Comparison of relatedness of EPSPS protein sequences

Comparison between Class I and Class II EPSPS protein sequences

                                          similarity   identity

S. cerevisiae vs. CP4                         54          30
A. nidulans vs. CP4                           50          25
B. napus vs. CP4                              47          22
A. thaliana vs. CP4                           48          22
N. tabacum vs. CP4                            50          24
L. esculentum vs. CP4                         50          24
P. hybrida vs. CP4                            50          23
Z. mays vs. CP4                               48          24
S. gallinarum vs. CP4                         51          25
S. typhimurium vs. CP4                        51          25
S. typhi vs. CP4                              51          25
K. pneumoniae vs. CP4                         56          28
Y. enterocolitica vs. CP4                     53          25
H. influenzae vs. CP4                         53          27
P. multocida vs. CP4                          55          30
A. salmonicida vs. CP4                        53          23
B. pertussis vs. CP4                          53          27
E. coli vs. CP4                               52          26
E. coli vs. LBAA                              52          26
E. coli vs. B. subtilis                       55          29
E. coli vs. D. nodosus                        55          32
E. coli vs. S. aureus                         55          29
E. coli vs. Synechocystis sp. PCC6803         53          30

Comparison between Class I EPSPS protein sequences

                                          similarity   identity

E. coli vs. S. typhimurium                    93          88
P. hybrida vs. E. coli                        72          55

Comparison between Class II EPSPS protein sequences

                                          similarity   identity

D. nodosus vs. CP4                            62          43
LBAA vs. CP4                                  90          83
PG2982 vs. CP4                                90          83
S. aureus vs. CP4                             58          34
B. subtilis vs. CP4                           59          41
Synechocystis sp. PCC6803 vs. CP4             62          45

1 The EPSPS sequences compared here were obtained from the following
references: E. coli, Rogers et al., 1983; S. typhimurium, Stalker et al., 1985;
Petunia hybrida, Shah et al., 1986; B. pertussis, Maskell et al., 1988; S.
cerevisiae, Duncan et al., 1987; Synechocystis sp. PCC6803, Dalla Chiesa et
al., 1994; and D. nodosus, Alm et al., 1994.

2 "GAP" Program, Genetics Computer Group (1991), Program Manual for
the GCG Package, Version 7, April 1991, 575 Science Drive, Madison,
Wisconsin, USA 53711.



The relative locations of the major conserved sequences among Class II
EPSP synthases which distinguish this group from the Class I EPSP
synthases are listed below in Table IVB.

Table IVB

                                 Seq. 1    Seq. 2    Seq. 3    Seq. 4

CP4
   start                          200        26       173       271
   end                            204        29       177       274

LBAA
   start                          200        26       173       271
   end                            204        29       177       274

PG2982
   start                          200        26       173       273
   end                            204        29       177       276

B. subtilis
   start                          190        17       164       257
   end                            194        20       168       260

S. aureus
   start                          193        21       166       261
   end                            197        24       170       264

Synechocystis sp. PCC6803
   start                          210        34       183       278
   end                            214        38       187       281

D. nodosus
   start                          195        22       168       261
   end                            199        25       172       264

min. start                        190        17       164       257
max. end                          214        38       187       281

1 -R-X1-H-X2-E- (SEQ ID NO:37)

2 -G-D-K-X3- (SEQ ID NO:38)

3 -S-A-Q-X4-K- (SEQ ID NO:39)

4 -N-X5-T-R- (SEQ ID NO:40)
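
Locations such as those in Table IVB can be found by scanning a protein
sequence for the four conserved patterns. The sketch below is a simplified
illustration: each X position is treated as matching any residue, which is
broader than the specification allows (the X positions are restricted to
particular residue classes), and the function and dictionary names are
hypothetical. The example input is the CP4 fragment of SEQ ID NO:32, so the
reported positions are relative to that fragment and not to the full-length
protein.

```python
import re

# Conserved Class II patterns; "." stands in for the X positions
# (simplifying assumption: any residue is accepted there).
MOTIFS = {
    "Seq. 1 (SEQ ID NO:37)": "R.H.E",
    "Seq. 2 (SEQ ID NO:38)": "GDK.",
    "Seq. 3 (SEQ ID NO:39)": "SAQ.K",
    "Seq. 4 (SEQ ID NO:40)": "N.TR",
}


def find_motifs(protein):
    """Map each motif name to a list of 1-based (start, end) positions."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead with a capture group also reports overlapping matches.
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in re.finditer(f"(?=({pattern}))", protein)
        ]
    return hits


cp4_fragment = "PGDKSISHRSFMFGGL"  # SEQ ID NO:32
for name, positions in find_motifs(cp4_fragment).items():
    print(name, positions)
```

A stricter version would replace each "." with a character class listing the
residues permitted at that position.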

The domains of EPSP synthase sequence identified in this application
were determined to be those important for maintenance of glyphosate
resistance and productive binding of PEP. The information used in
identifying these domains included sequence alignments of numerous
glyphosate-sensitive EPSPS molecules and the three-dimensional x-ray
structures of E. coli EPSPS (Stallings et al., 1991) and CP4 EPSPS. The
structures are representative of a glyphosate-sensitive (i.e., Class I) enzyme,
and a naturally occurring glyphosate-tolerant (i.e., Class II) enzyme of the
present invention. These exemplary molecules were superposed
three-dimensionally and the results displayed on a computer graphics terminal.
Inspection of the display allowed for structure-based fine-tuning of the
sequence alignments of glyphosate-sensitive and glyphosate-resistant EPSPS
molecules. The new sequence alignments were examined to determine
differences between Class I and Class II EPSPS enzymes. Seven regions were
identified and these regions were located in the x-ray structure of CP4 EPSPS
which also contained a bound analog of the intermediate which forms
catalytically between PEP and S3P.

The structure of the CP4 EPSPS with the bound intermediate analog 
was displayed on a computer graphics terminal and the seven sequence 
segments were examined. Important residues for glyphosate binding were 
identified as well as those residues which stabilized the conformations of 
those important residues: adjoining residues were considered necessary for 
maintenance of correct three-dimensional structural motifs in the context of 
glyphosate-sensitive EPSPS molecules. Three of the seven domains were
determined not to be important for glyphosate tolerance and maintenance of
productive PEP binding. The following four primary domains were 
determined to be characteristic of Class II EPSPS enzymes of the present 
invention: 

-R-X1-H-X2-E (SEQ ID NO:37), in which

X1 is an uncharged polar or acidic amino acid,
X2 is serine or threonine,

The Arginine (R) residue at position 1 is important because
the positive charge of its guanidinium group destabilizes the
binding of glyphosate. The Histidine (H) residue at position 3
stabilizes the Arginine (R) residue at position 4 of SEQ ID
NO:40. The Glutamic Acid (E) residue at position 5 stabilizes
the Lysine (K) residue at position 5 of SEQ ID NO:39.

-G-D-K-X3 (SEQ ID NO:38), in which
X3 is serine or threonine,

The Aspartic acid (D) residue at position 2 stabilizes the
Arginine (R) residue at position 4 of SEQ ID NO:40. The Lysine
(K) residue at position 3 is important for productive PEP
binding.

-S-A-Q-X4-K (SEQ ID NO:39), in which
X4 is any amino acid,

The Alanine (A) residue at position 2 stabilizes the 
Arginine (R) residue at position 1 of SEQ ID NO:37. The Serine 
(S) residue at position 1 and the Glutamine (Q) residue at 
position 3 are important for productive S3P binding. 

-N-X5-T-R (SEQ ID NO:40), in which
X5 is any amino acid,

The Asparagine (N) residue at position 1 and the
Threonine (T) residue at position 3 stabilize residue X1 at
position 2 of SEQ ID NO:37. The Arginine (R) residue at position
4 is important because the positive charge of its guanidinium
group destabilizes the binding of glyphosate.



Since the above sequences are only representative of the Class II 
EPSPSs which would be included within the generic structure of this group of 
EPSP synthases, the above sequences may be found within a subject EPSP 
synthase molecule within slightly more expanded regions. It is believed that the 
above-described conserved sequences would likely be found in the following 
regions of the mature EPSP synthase molecule:

-R-X1-H-X2-E- (SEQ ID NO:37) located between amino acids 175 and 230 of the
mature EPSP synthase sequence;
-G-D-K-X3- (SEQ ID NO:38) located between amino acids 5 and 55 of the
mature EPSP synthase sequence;
-S-A-Q-X4-K- (SEQ ID NO:39) located between amino acids 150 and 200 of the
mature EPSP synthase sequence; and
-N-X5-T-R- (SEQ ID NO:40) located between amino acids 245 and 295 of the
mature EPSP synthase sequence.
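
The four conserved sequences and their expected regions can be expressed as a simple pattern search. The following is a minimal sketch only, not part of the disclosed methods. It assumes that "uncharged polar or acidic" for X1 means S, T, N, Q, C, Y, D or E (one common reading; the text does not list the residues), and the 300-residue test sequence is a hypothetical toy that merely places motifs at the CP4 positions of Table IVB.

```python
import re

# Assumption: X1 ("uncharged polar or acidic") is read as S, T, N, Q, C, Y, D or E.
X1 = "[STNQCYDE]"

# name -> (regex, first residue of window, last residue of window), 1-based
MOTIFS = {
    "SEQ ID NO:37": ("R" + X1 + "H[ST]E", 175, 230),  # -R-X1-H-X2-E-
    "SEQ ID NO:38": ("GDK[ST]", 5, 55),               # -G-D-K-X3-
    "SEQ ID NO:39": ("SAQ.K", 150, 200),              # -S-A-Q-X4-K-
    "SEQ ID NO:40": ("N.TR", 245, 295),               # -N-X5-T-R-
}


def find_motifs(protein):
    """Return {name: [(start, end), ...]} for hits inside the expected window.

    Positions are 1-based and inclusive, counted on the mature sequence.
    """
    hits = {}
    for name, (pattern, lo, hi) in MOTIFS.items():
        found = []
        # A lookahead is used so that overlapping matches are also reported.
        for m in re.finditer("(?=(" + pattern + "))", protein):
            start = m.start() + 1
            end = m.start() + len(m.group(1))
            if lo <= start and end <= hi:
                found.append((start, end))
        hits[name] = found
    return hits


def has_all_class_ii_motifs(protein):
    """True when every one of the four motifs occurs in its window."""
    return all(find_motifs(protein).values())


def build_toy_sequence():
    """Hypothetical 300-residue sequence with motifs at the CP4 positions."""
    seq = ["A"] * 300
    for start, motif in [(26, "GDKS"), (173, "SAQVK"),
                         (200, "RDHTE"), (271, "NPTR")]:
        seq[start - 1:start - 1 + len(motif)] = list(motif)
    return "".join(seq)


toy = build_toy_sequence()
toy_hits = find_motifs(toy)
```

A sequence that passes this check only satisfies the sequence criteria; kinetic characterization would still be needed to confirm the enzyme class.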

One difference that may be noted between the deduced amino acid 
sequences of the CP4 and LBAA EPSPS proteins is at position 100 where an 
Alanine is found in the case of the CP4 enzyme and a Glycine is found in the 
case of the LBAA enzyme. In the Class I EPSPS enzymes a Glycine is usually 
found in the equivalent position, i.e., Glycine 96 in E. coli and K. pneumoniae and
Glycine 101 in Petunia. In the case of these three enzymes it has been
reported that converting that Glycine to an Alanine results in an elevation of
the appKi for glyphosate and a concomitant elevation in the appKm for PEP
(Kishore et al., 1986; Kishore and Shah, 1988; Sost and Amrhein, 1990), which,
as discussed above, makes the enzyme less efficient, especially under
conditions of lower PEP concentrations. The Glycine 100 of the LBAA EPSPS
was converted to an Alanine and both the appKm for PEP and the appKi for
glyphosate were determined for the variant. The Glycine 100 to Alanine change
was introduced by mutagenesis using the following primer:

CGGCAATGCCGCCACCGGCGCGCGCC (SEQ ID NO:34) 

and both the wild type and variant genes were expressed in E. coli in a RecA 
promoter expression vector (pMON17201 and pMON17264, respectively) and 
the appKm's and appKi's determined in crude lysates. The data indicate that 
the appKi(glyphosate) for the G100A variant is elevated about 16-fold (Table
V). This result is in agreement with the observation of the importance of this
G-A change in raising the appKi(glyphosate) in the Class I EPSPS enzymes.
However, in contrast to the results in the Class I G-A variants, the
appKm(PEP) in the Class II (LBAA) G-A variant is unaltered. This provides
yet another distinction between the Class II and Class I EPSPS enzymes.
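
The "about 16-fold" figure follows directly from the values reported in Table V (appKi of 28 and 459 µM; appKm of 5.3 and 5.5 µM). The short sketch below only reproduces that arithmetic; the function name is illustrative.

```python
def fold_change(variant_value, wild_type_value):
    """Ratio of the variant constant to the wild-type constant."""
    return variant_value / wild_type_value


# Values taken from Table V (all in micromolar).
ki_fold = fold_change(459.0, 28.0)  # appKi(glyphosate), G100A vs. wild type
km_fold = fold_change(5.5, 5.3)     # appKm(PEP), G100A vs. wild type

print(f"appKi(glyphosate) fold change: {ki_fold:.1f}")
print(f"appKm(PEP) fold change: {km_fold:.2f}")
```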

Table V

Lysate prepared from:                 appKm(PEP)@     appKi(glyphosate)

E. coli/pMON17201 (wild type)           5.3 µM           28 µM*
E. coli/pMON17264 (G100A variant)       5.5 µM          459 µM#

@ range of PEP: 2-40 µM
* range of glyphosate: 0-310 µM; # range of glyphosate: 0-5000 µM.

The LBAA G100A variant, by virtue of its superior kinetic properties, should
be capable of imparting improved in planta glyphosate tolerance.

Modification and Resynthesis of the Agrobacterium sp. strain CP4
EPSPS Gene Sequence

The EPSPS gene from Agrobacterium sp. strain CP4 contains
sequences that could be inimical to high expression of the gene in plants. These
sequences include potential polyadenylation sites that are often A+T rich,
a higher G+C% than that frequently found in plant genes (63% versus ~50%),
concentrated stretches of G and C residues, and codons that are not used
frequently in plant genes. The high G+C% in the CP4 EPSPS gene has a
number of potential consequences including the following: a higher usage of G or
C than that found in plant genes in the third position in codons, and the
potential to form strong hair-pin structures that may affect expression or
stability of the RNA. The reduction in the G+C content of the CP4 EPSPS 
gene, the disruption of stretches of G's and C's, the elimination of potential 
polyadenylation sequences, and improvements in the codon usage to that used 
more frequently in plant genes, could result in higher expression of the CP4 
EPSPS gene in plants. 

A synthetic CP4 gene was designed to change as completely as possible 
those inimical sequences discussed above. In summary, the gene sequence was 
redesigned to eliminate as much as possible the following sequences or 
sequence features (while avoiding the introduction of unnecessary restriction 
sites): stretches of G's and C's of 5 or greater; and A+T rich regions 
(predominantly) that could function as polyadenylation sites or potential RNA 
destabilization regions. The sequence of this gene is shown in Figure 8 (SEQ ID
NO:9). This coding sequence was expressed in E. coli from the RecA promoter 
and assayed for EPSPS activity and compared with that from the native CP4 
EPSPS gene. The apparent Km for PEP for the native and synthetic genes 
was 11.8 and 12.7, respectively, indicating that the enzyme expressed from the 
synthetic gene was unaltered. The N-terminus of the coding sequence was 
mutagenized to place an SphI site at the ATG to permit the construction of the 
CTP2-CP4 synthetic fusion for chloroplast import. The following primer was 
used to accomplish this mutagenesis: 

GGACGGCTGCTTGCACCGTGAAGCATGCTTAAGCTTGGCGTAATCATGG 
(SEQ ID NO:35). 
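
The sequence features targeted in the redesign (overall G+C content, stretches of G's and C's of 5 or greater, and A+T rich regions) are straightforward to screen for. The following is a minimal sketch under stated assumptions: the minimum A+T stretch of 6 is an arbitrary illustrative threshold, since the text gives no length for A+T rich regions, and the test sequence is a hypothetical toy, not part of SEQ ID NO:9.

```python
import re


def gc_percent(dna):
    """Percentage of G and C residues in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def gc_runs(dna, min_len=5):
    """1-based (start, run) pairs for stretches of G/C of at least min_len."""
    pattern = "[GC]{%d,}" % min_len
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, dna.upper())]


def at_runs(dna, min_len=6):
    """1-based (start, run) pairs for uninterrupted A/T stretches.

    Assumption: min_len=6 is only an illustrative cut-off.
    """
    pattern = "[AT]{%d,}" % min_len
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, dna.upper())]


# Hypothetical toy sequence containing one G/C stretch and one A+T stretch.
toy_dna = "ATG" + "GCCGCGGC" + "ATTTAAATTT" + "GACCTGA"

toy_gc = gc_percent(toy_dna)
toy_gc_runs = gc_runs(toy_dna)
toy_at_runs = at_runs(toy_dna)
```

A screen of this kind only flags candidate regions; which codons to substitute would still depend on the codon usage of the target plant.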

Expression of Chloroplast Directed CP4 EPSPS

The glyphosate target in plants, the 5-enolpyruvyl-shikimate-3- 
phosphate synthase (EPSPS) enzyme, is located in the chloroplast. Many 
chloroplast-localized proteins, including EPSPS, are expressed from nuclear 
genes as precursors and are targeted to the chloroplast by a chloroplast
transit peptide (CTP) that is removed during the import steps. Examples of
other such chloroplast proteins include the small subunit (SSU) of Ribulose-
1,5-bisphosphate carboxylase (RUBISCO), Ferredoxin, Ferredoxin
oxidoreductase, the Light-harvesting-complex protein I and protein II, and
Thioredoxin F. It has been demonstrated in vivo and in vitro that non-
chloroplast proteins may be targeted to the chloroplast by use of protein
fusions with a CTP and that a CTP sequence is sufficient to target a protein to
the chloroplast.

A CTP-CP4 EPSPS fusion was constructed between the Arabidopsis 
thaliana EPSPS CTP (Klee et al., 1987) and the CP4 EPSPS coding sequences. 
The Arabidopsis CTP was engineered by site-directed mutagenesis to place an
SphI restriction site at the CTP processing site. This mutagenesis replaced the
Glu-Lys at this location with Cys-Met. The sequence of this CTP, designated as
CTP2 (SEQ ID NO:10), is shown in Figure 9. The N-terminus of the CP4
EPSPS gene was modified to place an SphI site that spans the Met codon. The
second codon was converted to one for leucine in this step also. This change had
no apparent effect on the in vivo activity of CP4 EPSPS in E. coli as judged by
rate of complementation of the aroA allele. This modified N-terminus was then
combined with the SacI C-terminus and cloned downstream of the CTP2
sequences. The CTP2-CP4 EPSPS fusion was cloned into pBlueScript KS(+).
This vector may be transcribed in vitro using the T7 polymerase and the RNA
translated with 35S-Methionine to provide material that may be evaluated for
import into chloroplasts isolated from Lactuca sativa using the methods
described hereinafter (della-Cioppa et al., 1986, 1987). This template was
transcribed in vitro using T7 polymerase and the 35S-methionine-labeled
CTP2-CP4 EPSPS material was shown to import into chloroplasts with an
efficiency comparable to that for the control Petunia EPSPS (control = 35S
labeled PreEPSPS [pMON6140; della-Cioppa et al., 1986]).
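
An SphI site can span a Met codon because the SphI recognition sequence, GCATGC, itself contains ATG. As an illustrative sketch only, the code below scans the mutagenesis primer given above as SEQ ID NO:35 for a few standard recognition sequences (SphI GCATGC, EcoRI GAATTC, HindIII AAGCTT). The function name and the choice of enzymes to screen are assumptions made for this example.

```python
# Standard recognition sequences for the enzymes screened in this example.
SITES = {"SphI": "GCATGC", "EcoRI": "GAATTC", "HindIII": "AAGCTT"}


def find_sites(dna, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        i = dna.find(site)
        while i != -1:
            positions.append(i + 1)
            i = dna.find(site, i + 1)  # step by one to allow overlapping hits
        hits[enzyme] = positions
    return hits


# Primer sequence as printed for SEQ ID NO:35.
PRIMER_SEQ_ID_35 = "GGACGGCTGCTTGCACCGTGAAGCATGCTTAAGCTTGGCGTAATCATGG"

primer_hits = find_sites(PRIMER_SEQ_ID_35)
sphi_contains_atg = "ATG" in SITES["SphI"]
```

All three recognition sequences are palindromic, so scanning one strand is sufficient to detect a site on either strand.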

In another example the Arabidopsis EPSPS CTP, designated as CTP3, 
was fused to the CP4 EPSPS through an EcoRI site. The sequence of this
CTP3 (SEQ ID NO:12) is shown in Figure 10. An EcoRI site was introduced
into the Arabidopsis EPSPS mature region around amino acid 27, replacing the
sequence -Arg-Ala-Leu-Leu- with -Arg-Ile-Leu-Leu- in the process. The primer
of the following sequence was used to modify the N-terminus of the CP4
EPSPS gene to add an EcoRI site to effect the fusion to the CTP3:

GGAAGACGCCCASMHQACGGTGCAAGCAGCCGG
(SEQ ID NO:36) (the EcoRI site is underlined).

This CTP3-CP4 EPSPS fusion was also cloned into the pBlueScript vector and 
the T7 expressed fusion was found to also import into chloroplasts with an 
efficiency comparable to that for the control Petunia EPSPS (pMON6140). 

A related series of CTPs, designated as CTP4 (SphI) and CTP5 (EcoRI), 
based on the Petunia EPSPS CTP and gene were also fused to the SphI- and
EcoRI-modified CP4 EPSPS gene sequences. The SphI site was added by site-
directed mutagenesis to place this restriction site (and change the amino acid
sequence to -Cys-Met-) at the chloroplast processing site. All of the CTP-CP4
EPSPS fusions were shown to import into chloroplasts with approximately 
equal efficiency. The CTP4 (SEQ ID NO:14) and CTP5 (SEQ ID NO:16) 
sequences are shown in Figures 11 and 12. 

A CTP2-LBAA EPSPS fusion was also constructed following the 
modification of the N-terminus of the LBAA EPSPS gene by the addition of an
SphI site. This fusion was also found to be imported efficiently into
chloroplasts. 

By similar approaches, the CTP2-CP4 EPSPS and the CTP4-CP4 
EPSPS fusion have also been shown to import efficiently into chloroplasts 
prepared from the leaf sheaths of corn. These results indicate that these CTP- 
CP4 fusions could also provide useful genes to impart glyphosate tolerance in 
monocot species.

The use of CTP2 or CTP4 is preferred because these transit peptide
constructions yield mature EPSPS enzymes upon import into the chloroplast
which are closer in composition to the native EPSPSs not containing a transit 
peptide signal. Those skilled in the art will recognize that various chimeric 
constructs can be made which utilize the functionality of a particular CTP to 
import a Class II EPSPS enzyme into the plant cell chloroplast. The 
chloroplast import of the Class II EPSPS can be determined using the following 
assay. 

Chloroplast Uptake Assay 

Intact chloroplasts are isolated from lettuce (Lactuca sativa, var.
longifolia) by centrifugation in Percoll/ficoll gradients as modified from Bartlett
et al. (1982). The final pellet of intact chloroplasts is suspended in 0.5 ml of
sterile 330 mM sorbitol in 50 mM Hepes-KOH, pH 7.7, assayed for chlorophyll
(Arnon, 1949), and adjusted to the final chlorophyll concentration of 4 mg/ml
(using sorbitol/Hepes). The yield of intact chloroplasts from a single head of
lettuce is 3-6 mg chlorophyll.

A typical 300 µl uptake experiment contained 5 mM ATP, 8.3 mM
unlabeled methionine, 322 mM sorbitol, 58.3 mM Hepes-KOH (pH 8.0), 50 µl
reticulocyte lysate translation products, and intact chloroplasts from L. sativa
(200 µg chlorophyll). The uptake mixture is gently rocked at room temperature
(in 10 x 75 mm glass tubes) directly in front of a fiber optic illuminator set at
maximum light intensity (150 Watt bulb). Aliquot samples of the uptake mix
(about 50 µl) are removed at various times and fractionated over 100 µl
silicone-oil gradients (in 150 µl polyethylene tubes) by centrifugation at 11,000
X g for 30 seconds. Under these conditions, the intact chloroplasts form a pellet
under the silicone-oil layer and the incubation medium (containing the
reticulocyte lysate) floats on the surface. After centrifugation, the silicone-oil
gradients are immediately frozen in dry ice. The chloroplast pellet is then
resuspended in 50-100 µl of lysis buffer (10 mM Hepes-KOH pH 7.5, 1 mM
PMSF, 1 mM benzamidine, 5 mM ε-amino-n-caproic acid, and 30 µg/ml
aprotinin) and centrifuged at 15,000 X g for 20 minutes to pellet the thylakoid
membranes. The clear supernatant (stromal proteins) from this spin, and an
aliquot of the reticulocyte lysate incubation medium from each uptake
experiment, are mixed with an equal volume of 2X SDS-PAGE sample buffer
for electrophoresis (Laemmli, 1970).
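
As a worked illustration of the volumes implied above, the chloroplast suspension adjusted to 4 mg/ml chlorophyll supplies the 200 µg of chlorophyll in 50 µl, which together with the 50 µl of translation products leaves 200 µl of the 300 µl reaction for the remaining components. The sketch below only reproduces this arithmetic; the function name is illustrative.

```python
def volume_ul(mass_ug, conc_mg_per_ml):
    """Volume in microliters that holds mass_ug at the given concentration.

    1 mg/ml is the same as 1 microgram per microliter, so no unit factor
    is needed.
    """
    return mass_ug / conc_mg_per_ml


TOTAL_UL = 300.0   # typical uptake experiment volume
LYSATE_UL = 50.0   # reticulocyte lysate translation products

chloroplast_ul = volume_ul(200.0, 4.0)  # 200 ug chlorophyll at 4 mg/ml
remaining_ul = TOTAL_UL - LYSATE_UL - chloroplast_ul

print(f"chloroplast suspension: {chloroplast_ul:.0f} ul")
print(f"left for ATP, methionine and buffer: {remaining_ul:.0f} ul")
```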

SDS-PAGE is carried out according to Laemmli (1970) in 3-17% (w/v) 
acrylamide slab gels (60 mm X 1.5 mm) with 3% (w/v) acrylamide stacking gels 
(5 mm X 1.5 mm). The gel is fixed for 20-30 min in a solution with 40% 
methanol and 10% acetic acid. Then, the gel is soaked in EN3HANCE™ 
(DuPont) for 20-30 minutes, followed by drying the gel on a gel dryer. The gel is 
imaged by autoradiography, using an intensifying screen and an overnight 
exposure to determine whether the CP4 EPSPS is imported into the isolated 
chloroplasts. 

Plant Transformation 

Plants which can be made glyphosate-tolerant by practice of the 
present invention include, but are not limited to, soybean, cotton, corn, canola, 
oil seed rape, flax, sugarbeet, sunflower, potato, tobacco, tomato, wheat, rice, 
alfalfa and lettuce as well as various tree, nut and vine species. 

A double-stranded DNA molecule of the present invention ("chimeric 
gene") can be inserted into the genome of a plant by any suitable method. 
Suitable plant transformation vectors include those derived from a Ti plasmid 
of Agrobacterium tumefaciens, as well as those disclosed, e.g., by Herrera-
Estrella (1983), Bevan (1984), Klee (1985) and EPO publication 120,516
(Schilperoort et al.). In addition to plant transformation vectors derived from
the Ti or root-inducing (Ri) plasmids of Agrobacterium, alternative methods can
be used to insert the DNA constructs of this invention into plant cells. Such
methods may involve, for example, the use of liposomes, electroporation,
chemicals that increase free DNA uptake, free DNA delivery via
microprojectile bombardment, and transformation using viruses or pollen.

Class II EPSPS Plant Transformation Vectors

Class II EPSPS DNA sequences may be engineered into vectors 
capable of transforming plants by using known techniques. The following 
description is meant to be illustrative and not to be read in a limiting sense. 
One of ordinary skill in the art would know that other plasmids, vectors,
markers, promoters, etc. could be used with suitable results. The CTP2-CP4
EPSPS fusion was cloned as a BglII-EcoRI fragment into the plant vector
pMON979 (described below) to form pMON17110, a map of which is presented
in Figure 13. In this vector the CP4 gene is expressed from the enhanced
CaMV35S promoter (E35S; Kay et al., 1987). An FMV35S promoter construct
(pMON17116) was completed in the following way: The SalI-NotI and the
NotI-BglII fragments from pMON979 containing the Spc/AAC(3)-III/oriV and
the pBR322/Right Border/NOS 3'/CP4 EPSPS gene segment from
pMON17110 were ligated with the XhoI-BglII FMV35S promoter fragment
from pMON981. These vectors were introduced into tobacco, cotton and
canola. 

A series of vectors was also completed in the vector pMON977 in which 
the CP4 EPSPS gene, the CTP2-CP4 EPSPS fusion, and the CTP3-CP4 fusion 
were cloned as BglII-SacI fragments to form pMON17124, pMON17119, and
pMON17120, respectively. These plasmids were introduced into tobacco. A 
pMON977 derivative containing the CTP2-LBAA EPSPS gene was also 
completed (pMON17206) and introduced into tobacco. 

The pMON979 plant transformation/expression vector was derived from 
pMON886 (described below) by replacing the neomycin phosphotransferase 
type II (KAN) gene in pMON886 with the 0.89 kb fragment containing the
bacterial gentamicin-3-N-acetyltransferase type III (AAC(3)-III) gene
(Hayford et al., 1988). The chimeric P-35S/AAC(3)-III/NOS 3' gene encodes
gentamicin resistance which permits selection of transformed plant cells.
pMON979 also contains a 0.95 kb expression cassette consisting of the 
enhanced CaMV 35S promoter (Kay et al., 1987), several unique restriction 
sites, and the NOS 3' end (P-En-CaMV35S/NOS 3'). The rest of the pMON979 
DNA segments are exactly the same as in pMON886. 

Plasmid pMON886 is made up of the following segments of DNA. The 
first is a 0.93 kb AvaI to engineered-EcoRV fragment isolated from transposon
Tn7 that encodes bacterial spectinomycin/streptomycin resistance (Spc/Str),
which is a determinant for selection in E. coli and Agrobacterium tumefaciens.
This is joined to the 1.61 kb segment of DNA encoding a chimeric kanamycin
resistance which permits selection of transformed plant cells. The chimeric
gene (P-35S/KAN/NOS 3') consists of the cauliflower mosaic virus (CaMV) 35S
promoter, the neomycin phosphotransferase type II (KAN) gene, and the
3'-nontranslated region of the nopaline synthase gene (NOS 3') (Fraley et al.,
1983). The next segment is the 0.75 kb oriV containing the origin of replication
from the RK2 plasmid. It is joined to the 3.1 kb SalI to PvuI segment of
pBR322 (ori-322) which provides the origin of replication for maintenance in
E. coli and the bom site for the conjugational transfer into the Agrobacterium
tumefaciens cells. The next segment is the 0.36 kb PvuI to BclI from pTiT37
that carries the nopaline-type T-DNA right border (Fraley et al., 1985).

The pMON977 vector is the same as pMON981 except for the presence 
of the P-En-CaMV35S promoter in place of the FMV35S promoter (see below). 

The pMON981 plasmid contains the following DNA segments: the 0.93 
kb fragment isolated from transposon Tn7 encoding bacterial 
spectinomycin/streptomycin resistance [Spc/Str; a determinant for selection in 
E. coli and Agrobacterium tumefaciens (Fling et al., 1985)]; the chimeric
kanamycin resistance gene engineered for plant expression to allow selection of
the transformed tissue, consisting of the 0.35 kb cauliflower mosaic virus 35S
promoter (P-35S) (Odell et al., 1985), the 0.83 kb neomycin
phosphotransferase type II gene (KAN), and the 0.26 kb 3'-nontranslated
region of the nopaline synthase gene (NOS 3') (Fraley et al., 1983); the 0.75 kb
origin of replication from the RK2 plasmid (oriV) (Stalker et al., 1981); the 3.1
kb SalI to PvuI segment of pBR322 which provides the origin of replication for
maintenance in E. coli (ori-322) and the bom site for the conjugational transfer
into the Agrobacterium tumefaciens cells; and the 0.36 kb PvuI to BclI
fragment from the pTiT37 plasmid containing the nopaline-type T-DNA right
border region (Fraley et al., 1985). The expression cassette consists of the 0.6
kb 35S promoter from the figwort mosaic virus (P-FMV35S) (Gowda et al.,
1989) and the 0.7 kb 3' non-translated region of the pea rbcS-E9 gene (E9 3')
(Coruzzi et al., 1984, and Morelli et al., 1985). The 0.6 kb SspI fragment
containing the FMV35S promoter (Figure 1) was engineered to place suitable
cloning sites downstream of the transcriptional start site. The CTP2-CP4syn 
gene fusion was introduced into plant expression vectors (including pMON981, 
to form pMON17131; Figure 14) and transformed into tobacco, canola, potato, 
tomato, sugarbeet, cotton, lettuce, cucumber, oil seed rape, poplar, and 
Arabidopsis. 

The plant vector containing the Class II EPSPS gene may be mobilized 
into any suitable Agrobacterium strain for transformation of the desired plant 
species. The plant vector may be mobilized into an ABI Agrobacterium strain. 
A suitable ABI strain is the A208 Agrobacterium tumefaciens carrying the 
disarmed Ti plasmid pTiC58 (pMP90RK) (Koncz and Schell, 1986). The Ti 
plasmid does not carry the T-DNA phytohormone genes and the strain is 
therefore unable to cause the crown gall disease. Mating of the plant vector 
into ABI was done by the triparental conjugation system using the helper 
plasmid pRK2013 (Ditta et al., 1980). When the plant tissue is incubated with
the ABI::plant vector conjugate, the vector is transferred to the plant cells by
the vir functions encoded by the disarmed pTiC58 plasmid. The vector opens at
the T-DNA right border region, and the entire plant vector sequence may be 
inserted into the host plant chromosome. The pTiC58 Ti plasmid does not 
transfer to the plant cells but remains in the Agrobacterium.

Class II EPSPS Free DNA Vectors

Class II EPSPS genes may also be introduced into plants through direct 
delivery methods. A number of direct delivery vectors were completed for the 
CP4 EPSPS gene. The vector pMON13640, a map of which is presented in 
Figure 15, is described here. The plasmid vector is based on a pUC plasmid 
(Vieira and Messing, 1987) containing, in this case, the nptII gene (kanamycin
resistance; KAN) from Tn903 to provide a selectable marker in E. coli. The
CTP4-EPSPS gene fusion is expressed from the P-FMV35S promoter and
contains the NOS 3' polyadenylation sequence fragment and from a second
cassette consisting of the E35S promoter, the CTP4-CP4 gene fusion and the
NOS 3' sequences. The scoreable GUS marker gene (Jefferson et al., 1987) is
expressed from the mannopine synthase promoter (P-MAS; Velten et al., 1984)
and the soybean 7S storage protein gene 3' sequences (Schuler et al., 1982). 
Similar plasmids could also be made in which CTP-CP4 EPSPS fusions are 
expressed from the enhanced CaMV35S promoter or other plant promoters. 
Other vectors could be made that are suitable for free DNA delivery into plants 
and such are within the skill of the art and contemplated to be within the scope 
of this disclosure. 

Plastid Transformation

While transformation of the nuclear genome of plants is much more developed 
at this time, a rapidly advancing alternative is the transformation of plant 
organelles. The transformation of plastids of land plants and the regeneration 
of stable transformants has been demonstrated (Svab et al., 1990; Maliga et 
al., 1993). Transformants are selected, following double cross-over events into 
the plastid genome, on the basis of resistance to spectinomycin conferred 
through rRNA changes or through the introduction of an aminoglycoside 3"-
adenyltransferase gene (Svab et al., 1990; Svab and Maliga, 1993), or
resistance to kanamycin through the neomycin phosphotransferase NptII
(Carrer et al., 1993). DNA is introduced by biolistic means (Svab et al., 1990;
Maliga et al., 1993) or by using polyethylene glycol (O'Neill et al., 1993). This
transformation route results in the production of 500-10,000 copies of the
introduced sequence per cell and high levels of expression of the introduced gene
have been reported (Carrer et al., 1993; Maliga et al., 1993). The use of plastid
transformation offers the advantages of not requiring the chloroplast transit
peptide signal sequence to result in the localization of the heterologous Class II 
EPSPS in the chloroplast and the potential to have many copies of the 
heterologous plant-expressible Class II EPSPS gene in each plant cell since at 
least one copy of the gene would be in each plastid of the cell. 

Plant Regeneration 

When expression of the Class II EPSPS gene is achieved in transformed 
cells (or protoplasts), the cells (or protoplasts) are regenerated into whole 
plants. Choice of methodology for the regeneration step is not critical, with 
suitable protocols being available for hosts from Leguminosae (alfalfa, 
soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae 
(cabbage, radish, rapeseed, etc.), Cucurbitaceae (melons and cucumber), 
Gramineae (wheat, rice, corn, etc.), Solanaceae (potato, tobacco, tomato, 
peppers), various floral crops as well as various trees such as poplar or apple, 
nut crops or vine plants such as grapes. See, e.g., Ammirato, 1984; 
Shimamoto, 1989; Fromm, 1990; Vasil, 1990. 

The following examples are provided to better elucidate the practice of 
the present invention and should not be interpreted in any way to limit the 
scope of the present invention. Those skilled in the art will recognize that 
various modifications, truncations, etc. can be made to the methods and genes 
described herein while not departing from the spirit and scope of the present 
invention. 

In the examples that follow, EPSPS activity in plants is assayed by the 
following method. Tissue samples were collected and immediately frozen in
liquid nitrogen. One gram of young leaf tissue was frozen in a mortar with liquid
nitrogen and ground to a fine powder with a pestle. The powder was then
transferred to a second mortar, extraction buffer was added (1 ml/gram), and
the sample was ground for an additional 45 seconds. The extraction buffer for
canola consists of 100 mM Tris, 1 mM EDTA, 10% glycerol, 5 mM DTT, 1 mM
BAM, 5 mM ascorbate, 1.0 mg/ml BSA, pH 7.5 (4°C). The extraction buffer for
tobacco consists of 100 mM Tris, 10 mM EDTA, 35 mM KCl, 20% glycerol, 5
mM DTT, 1 mM BAM, 5 mM ascorbate, 1.0 mg/ml BSA, pH 7.5 (4°C). The
mixture was transferred to a microfuge tube and centrifuged for 5 minutes. The 
resulting supernatants were desalted on spin G-50 (Pharmacia) columns, 
previously equilibrated with extraction buffer (without BSA), in 0.25 ml 
aliquots. The desalted extracts were assayed for EPSP synthase activity by 
radioactive HPLC assay. Protein concentrations in samples were determined 
by the BioRad microprotein assay with BSA as the standard. 

Protein concentrations were determined using the BioRad Microprotein 
method. BSA was used to generate a standard curve ranging from 2-24 µg.
Either 800 µl of standard or diluted sample was mixed with 200 µl of
concentrated BioRad Bradford reagent. The samples were vortexed and read at
A(595) after ~5 minutes and compared to the standard curve.
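
Comparison of a sample reading to the BSA standard curve can be sketched as a straight-line fit followed by interpolation. This is a minimal illustration only: the linear model is an assumption (the text does not state how the curve was fitted), and the absorbance readings below are hypothetical values chosen for the example, not measured data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_ug(a595, slope, intercept, lo=2.0, hi=24.0):
    """Interpolate protein (ug) from A(595); reject values off the curve."""
    amount = (a595 - intercept) / slope
    if not lo <= amount <= hi:
        raise ValueError("reading falls outside the 2-24 ug standard curve")
    return amount


# Hypothetical standards (ug BSA) and hypothetical A(595) readings.
standards_ug = [2.0, 6.0, 12.0, 18.0, 24.0]
readings = [0.11, 0.23, 0.41, 0.59, 0.77]

slope, intercept = fit_line(standards_ug, readings)
sample_ug = protein_ug(0.35, slope, intercept)
```

Readings outside the standard range are rejected rather than extrapolated, which mirrors diluting a sample until it falls on the curve.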

EPSPS enzyme assays contained HEPES (50 mM), shikimate-3-
phosphate (2 mM), NH4 molybdate (0.1 mM) and KF (5 mM), with or without
glyphosate (0.5 or 1.0 mM). The assay mix (30 µl) and plant extract (10 µl)
were preincubated for 1 minute at 25°C and the reactions were initiated by
adding 14C-PEP (1 mM). The reactions were quenched after 3 minutes with 50
µl of 90% EtOH/0.1 M HOAc, pH 4.5. The samples were spun at 6000 rpm and
the resulting supernatants were analyzed for 14C-EPSP production by HPLC.
Percent resistant EPSPS is calculated from the EPSPS activities with and 
without glyphosate. 
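
The text states only that percent resistant EPSPS is calculated from the activities with and without glyphosate. One natural reading, assumed here, is the ratio of the two activities expressed as a percentage. The sketch below uses that assumed formula with hypothetical activity values.

```python
def percent_resistant(activity_with_glyphosate, activity_without_glyphosate):
    """Assumed formula: 100 * (activity with glyphosate / activity without).

    Both activities must be in the same units (for example, initial
    velocities from the HPLC radioassay).
    """
    if activity_without_glyphosate <= 0:
        raise ValueError("activity without glyphosate must be positive")
    return 100.0 * activity_with_glyphosate / activity_without_glyphosate


# Hypothetical activities, arbitrary but identical units.
example_percent = percent_resistant(45.0, 60.0)
print(f"{example_percent:.0f}% of the EPSPS activity is glyphosate resistant")
```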

The percent conversion of 14C labeled PEP to 14C EPSP was determined
by HPLC radioassay using a C18 guard column (Brownlee) and an AX100
HPLC column (0.4 X 25 cm, Synchropak) with 0.28 M isocratic potassium
phosphate eluant, pH 6.5, at 1 ml/min. Initial velocities were calculated by
multiplying fractional turnover per unit time by the initial concentration of the
labeled substrate (1 mM). The assay was linear with time up to ~3 minutes
and 30% turnover to EPSP. Samples were diluted with 10 mM Tris, 10%
glycerol, 10 mM DTT, pH 7.5 (4°C) if necessary to obtain results within the 
linear range. 
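
The initial velocity calculation described above (fractional turnover per unit time multiplied by the initial 1 mM substrate concentration) can be sketched as follows. The function name and the example turnover value are illustrative; the limits of 3 minutes and 30% turnover come from the stated linear range.

```python
def initial_velocity(fraction_converted, minutes, substrate_mM=1.0):
    """Initial velocity in mM/min from fractional turnover of labeled PEP.

    Raises ValueError when the measurement lies outside the range in which
    the assay is stated to be linear (up to ~3 minutes and 30% turnover).
    """
    if minutes <= 0:
        raise ValueError("reaction time must be positive")
    if fraction_converted > 0.30 or minutes > 3.0:
        raise ValueError("outside the linear range; dilute the sample")
    return fraction_converted / minutes * substrate_mM


# Hypothetical example: 15% conversion after the 3 minute reaction.
example_velocity = initial_velocity(0.15, 3.0)
print(f"initial velocity: {example_velocity:.3f} mM/min")
```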

In these assays DL-dithiothreitol (DTT), benzamidine (BAM), and bovine
serum albumin (BSA, essentially globulin free) were obtained from Sigma.
Phosphoenolpyruvate (PEP) was from Boehringer Mannheim and
phosphoenol[1-14C]pyruvate (28 mCi/mmol) was from Amersham.

EXAMPLES 

Example 1 

Transformed tobacco plants have been generated with a number of the 
Class II EPSPS gene vectors containing the CP4 EPSPS DNA sequence as 
described above with suitable expression of the EPSPS. These transformed 
plants exhibit glyphosate tolerance imparted by the Class II CP4 EPSPS. 

Transformation of tobacco employs the tobacco leaf disc transformation 
protocol which utilizes healthy leaf tissue about 1 month old. After a 15-20
minute surface sterilization with 10% Clorox plus a surfactant, the leaves are
rinsed 3 times in sterile water. Using a sterile paper punch, leaf discs are
punched and placed upside down on MS104 media (MS salts 4.3 g/l, sucrose 30
g/l, B5 vitamins 500X 2 ml/l, NAA 0.1 mg/l, and BA 1.0 mg/l) for a 1 day
preculture. 

The discs are then inoculated with an overnight culture of a disarmed 
Agrobacterium ABI strain containing the subject vector that had been diluted 
1/5 (i.e.: about 0.6 OD). The inoculation is done by placing the discs in 
centrifuge tubes with the culture. After 30 to 60 seconds, the liquid is drained
off and the discs are blotted between sterile filter paper. The discs are then
placed upside down on MS104 feeder plates with a filter disc to co-culture. 

After 2-3 days of co-culture, the discs are transferred, still upside down,
to selection plates with MS104 media. After 2-3 weeks, callus tissue forms,
and individual clumps are separated from the leaf discs. Shoots are cleanly cut
from the callus when they are large enough to be distinguished from stems. The
shoots are placed on hormone-free rooting media (MSO: MS salts 4.3 g/l,
sucrose 30 g/l, and B5 vitamins 500X 2 ml/l) with selection for the appropriate
antibiotic resistance. Root formation occurs in 1-2 weeks. Any leaf callus
assays are preferably done on rooted shoots while still sterile. Rooted shoots 
are then placed in soil and kept in a high humidity environment (i.e.: plastic 
containers or bags). The shoots are hardened off by gradually exposing them to 
ambient humidity conditions. 

Expression of CP4 EPSPS protein in transformed plants

Tobacco cells were transformed with a number of plant vectors 
containing the native CP4 EPSPS gene, and using different promoters and/or 
CTP's. Preliminary evidence for expression of the gene was given by the ability 
of the leaf tissue from antibiotic selected transformed shoots to recallus on 
glyphosate. In some cases, glyphosate-tolerant callus was selected directly 
following transformation. The level of expression of the CP4 EPSPS was 
determined by the level of glyphosate-tolerant EPSPS activity (assayed in the 
presence of 0.5 mM glyphosate) or by Western blot analysis using a goat anti- 
CP4 EPSPS antibody. The Western blots were quantitated by densitometer 
tracing and comparison to a standard curve established using purified CP4 
EPSPS. These data are presented as % soluble leaf protein. The data from a 
number of transformed plant lines and transformation vectors are presented in 
Table VI below.

Table VI Expression of CP4 EPSPS in transformed tobacco tissue

Vector          Plant #     CP4 EPSPS **
                            (% leaf protein)

pMON17110       25313       0.02
pMON17110       25329       0.04
pMON17116       25095       0.02
pMON17119       25106       0.09
pMON17119       25762       0.09
pMON17119       25767       0.03

** Glyphosate-tolerant EPSPS activity was also demonstrated in leaf extracts 
for these plants. 

Glyphosate tolerance has also been demonstrated at the whole
plant level in transformed tobacco plants. In tobacco, R0 transformants of
CTP2-CP4 EPSPS were sprayed at 0.4 lb/acre (0.448 kg/hectare), a rate
sufficient to kill control non-transformed tobacco plants corresponding to a
rating of 3, 1 and 0 at days 7, 14 and 28, respectively, and were analyzed
vegetatively and reproductively (Table VII).
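The paired spray rates quoted above (0.4 lb/acre and 0.448 kg/hectare) are related by a fixed unit conversion. The following sketch, with a hypothetical function name, shows the arithmetic using the standard definitions of the pound and the international acre.

```python
LB_TO_KG = 0.45359237        # kilograms per avoirdupois pound (exact)
ACRE_TO_HA = 0.40468564224   # hectares per international acre (exact)


def lb_per_acre_to_kg_per_ha(rate_lb_per_acre):
    """Convert an application rate from lb/acre to kg/ha."""
    return rate_lb_per_acre * LB_TO_KG / ACRE_TO_HA


if __name__ == "__main__":
    # The tobacco spray rate given in the text.
    print(round(lb_per_acre_to_kg_per_ha(0.4), 3))
```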






Table VII Glyphosate tolerance in tobacco CP4 transformants*

                                  Score**
                           Vegetative            Fertile
Vector/Plant #        day 7   day 14   day 28

pMON17110/25313         6       4        2        no
pMON17110/25329         9      10       10        yes
pMON17119/25106         9       9       10        yes

* Spray rate = 0.4 lb/acre (0.448 kg/hectare)

** Plants are evaluated on a numerical scoring system of 0-10 where a
vegetative score of 10 represents no damage relative to nonsprayed controls
and 0 represents a dead plant. Reproductive scores (Fertile) are determined at
28 days after spraying and are evaluated as to whether or not the plant is
fertile.

Example 2A

Canola plants were transformed with the pMON17110, pMON17116,
and pMON17131 vectors and a number of plant lines of the transformed
canola were obtained which exhibit glyphosate tolerance.

Plant Material

Seedlings of Brassica napus cv Westar were established in 2 inch (~ 5
cm) pots containing Metro Mix 350. They were grown in a growth chamber at
24°C, 16/8 hour photoperiod, light intensity of 400 µE m-2 sec-1 (HID lamps).
They were fertilized with Peters 20-10-20 General Purpose Special. After 2 1/2
weeks they were transplanted to 6 inch (~ 15 cm) pots and grown in a growth
chamber at 15/10°C day/night temperature, 16/8 hour photoperiod, light
intensity of 800 µE m-2 sec-1 (HID lamps). They were fertilized with Peters
15-30-15 Hi-Phos Special.

Transformation/Selection/Regeneration 

Four terminal internodes from plants just prior to bolting or in the
process of bolting but before flowering were removed and surface sterilized in
70% v/v ethanol for 1 minute, 2% w/v sodium hypochlorite for 20 minutes and
rinsed 3 times with sterile deionized water. Stems with leaves attached could
be refrigerated in moist plastic bags for up to 72 hours prior to sterilization. Six
to seven stem segments were cut into 5 mm discs with a Redco Vegetable
Slicer 200, maintaining orientation of the basal end.

The Agrobacterium was grown overnight on a rotator at 24°C in 2 ml of
Luria Broth containing 50 mg/l kanamycin, 24 mg/l chloramphenicol and 100
mg/l spectinomycin. A 1:10 dilution was made in MS (Murashige and Skoog)
media giving approximately 9 x 10^8 cells per ml. This was confirmed with optical
density readings at 660 nm. The stem discs (explants) were inoculated with 1.0
ml of Agrobacterium and the excess was aspirated from the explants.

The explants were placed basal side down in petri plates containing
1/10X standard MS salts, B5 vitamins, 3% sucrose, 0.8% agar, pH 5.7, 1.0 mg/l
6-benzyladenine (BA). The plates were layered with 1.5 ml of media containing
MS salts, B5 vitamins, 3% sucrose, pH 5.7, 4.0 mg/l p-chlorophenoxyacetic
acid, 0.005 mg/l kinetin and covered with sterile filter paper.

Following a 2 to 3 day co-culture, the explants were transferred to deep
dish petri plates containing MS salts, B5 vitamins, 3% sucrose, 0.8% agar, pH
5.7, 1 mg/l BA, 500 mg/l carbenicillin, 50 mg/l cefotaxime, 200 mg/l kanamycin
or 175 mg/l gentamicin for selection. Seven explants were placed on each plate.
After 3 weeks they were transferred to fresh media, 5 explants per plate. The
explants were cultured in a growth room at 25°C, continuous light (Cool White).

Expression Assay

After 3 weeks shoots were excised from the explants. Leaf recallusing 
assays were initiated to confirm modification of R0 shoots. Three tiny pieces of
leaf tissue were placed on recallusing media containing MS salts, B5 vitamins,
3% sucrose, 0.8% agar, pH 5.7, 5.0 mg/l BA, 0.5 mg/l naphthalene acetic acid
(NAA), 500 mg/l carbenicillin, 50 mg/l cefotaxime and 200 mg/l kanamycin or
gentamicin or 0.5 mM glyphosate. The leaf assays were incubated in a growth 
room under the same conditions as explant culture. After 3 weeks the leaf 
recallusing assays were scored for herbicide tolerance (callus or green leaf 
tissue) or sensitivity (bleaching). 

Transplantation 

At the time of excision, the shoot stems were dipped in Rootone® and 
placed in 2 inch (~ 5 cm) pots containing Metro-Mix 350 and placed in a closed 
humid environment. They were placed in a growth chamber at 24°C, 16/8 hour 
photoperiod, 400 µE m-2 sec-1 (HID lamps) for a hardening-off period of
approximately 3 weeks. 

The seed harvested from R0 plants is R1 seed which gives rise to R1
plants. To evaluate the glyphosate tolerance of an R0 plant, its progeny are
evaluated. Because an R0 plant is assumed to be hemizygous at each insert
location, selfing results in maximum genotypic segregation in the R1. Because
each insert acts as a dominant allele, in the absence of linkage and assuming
only one hemizygous insert is required for tolerance expression, one insert
would segregate 3:1, two inserts, 15:1, three inserts 63:1, etc. Therefore,
relatively few R1 plants need be grown to find at least one resistant phenotype.
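The segregation ratios quoted above follow from each unlinked hemizygous insert being absent from one quarter of the selfed progeny, so that n inserts give a resistant:susceptible ratio of (4^n - 1):1. A minimal sketch of that arithmetic, with hypothetical function names and the same assumptions as the text (dominance, no linkage, one insert sufficient for tolerance):

```python
from fractions import Fraction


def fraction_susceptible(n_inserts):
    """Fraction of selfed progeny carrying none of n unlinked inserts."""
    if n_inserts < 1:
        raise ValueError("n_inserts must be at least 1")
    return Fraction(1, 4) ** n_inserts


def segregation_ratio(n_inserts):
    """Resistant:susceptible ratio, expressed as (resistant, 1)."""
    susceptible = fraction_susceptible(n_inserts)
    resistant = 1 - susceptible
    return int(resistant / susceptible), 1


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, segregation_ratio(n))
```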

Seed from an R0 plant is harvested, threshed, and dried before planting
in a glyphosate spray test. Various techniques have been used to grow the
plants for R1 spray evaluations. Tests are conducted in both greenhouses and
growth chambers. Two planting systems are used: ~ 10 cm pots or plant trays
containing 32 or 36 cells. Soil used for planting is either Metro 350 plus three
types of slow release fertilizer or plain Metro 350. Irrigation is either overhead
in greenhouses or sub-irrigation in growth chambers. Fertilizer is applied as
required in irrigation water. Temperature regimes appropriate for canola were
maintained. A sixteen hour photoperiod was maintained. At the onset of
flowering, plants are transplanted to ~ 15 cm pots for seed production.

A spray "batch" consists of several sets of R1 progenies all sprayed on
the same date. Some batches may also include evaluations of other than R1
plants. Each batch also includes sprayed and unsprayed non-transgenic
genotypes representing the genotypes in the particular batch which were
putatively transformed. Also included in a batch are one or more non-segregating
transformed genotypes previously identified as having some resistance.

Two to six plants from each individual R0 progeny are not sprayed and
serve as controls to compare and measure the glyphosate tolerance, as well as 
to assess any variability not induced by the glyphosate. When the other plants 
reach the 2-4 leaf stage, usually 10 to 20 days after planting, glyphosate is 
applied at rates varying from 0.28 to 1.12 kg/ha, depending on objectives of the 
study. Low rate technology using low volumes has been adopted. A laboratory 
track sprayer has been calibrated to deliver a rate equivalent to field 
conditions. 

A scale of 0 to 10 is used to rate the sprayed plants for vegetative
resistance. The scale is relative to the unsprayed plants from the same R0
plant. A 0 is death, while a 10 represents no visible difference from the
unsprayed plant. A higher number between 0 and 10 represents progressively
less damage as compared to the unsprayed plant. Plants are scored at 7, 14,
and 28 days after treatment (DAT), or until bolting, and a line is given the
average score of the sprayed plants within an R0 plant family.

Six integers are used to qualitatively describe the degree of reproductive
damage from glyphosate:

0: No floral bud development
2: Floral buds present, but aborted prior to opening
4: Flowers open, but no anthers, or anthers fail to extrude past petals
6: Sterile anthers
8: Partially sterile anthers
10: Fully fertile flowers

Plants are scored using this scale at or shortly after initiation of 
flowering, depending on the rate of floral structure development. 

Expression of EPSPS in Canola

After the 3 week period, the transformed canola plants were assayed for 
the presence of glyphosate-tolerant EPSPS activity (assayed in the presence 
of glyphosate at 0.5 mM). The results are shown in Table VIII.

Table VIII Expression of CP4 EPSPS in transformed canola plants

                              % resistant EPSPS activity
                              of leaf extract
Vector          Plant #       (at 0.5 mM glyphosate)

Vector Control                 0
pMON17110        41           47
pMON17110        52           28
pMON17110        71           82
pMON17110       104           75
pMON17110       172           84
pMON17110       177           85
pMON17110       252           29*
pMON17110       350           49
pMON17116        40           25
pMON17116        99           87
pMON17116       175           94
pMON17116       178           43
pMON17116       182           18
pMON17116       252           69
pMON17116       298           44*
pMON17116       332           89
pMON17116       383           97
pMON17116       395           52

* assayed in the presence of 1.0 mM glyphosate





R1 transformants of canola were then grown in a growth chamber and
sprayed with glyphosate at 0.56 kg/ha (kilogram/hectare) and rated
vegetatively. These results are shown in Tables IXA - IXC. It is to be noted that
expression of glyphosate resistant EPSPS in all tissues is preferred to observe
optimal glyphosate tolerance phenotype in these transgenic plants. In the
Tables below, only expression results obtained with leaf tissue are described.

Table IXA Glyphosate tolerance in Class II EPSPS
(pMON17110 = P-E35S; pMON17116 = P-FMV35S; R1 plants;
Spray rate = 0.56 kg/ha)

                       % resistant     Vegetative Score**
Vector/Plant No.       EPSPS*          day 7     day 14

Control Westar          0              5          3
pMON17110/41           47              6          7
pMON17110/71           82              6          7
pMON17110/177          85              9         10
pMON17116/40           25              9          9
pMON17116/99           87              9         10
pMON17116/175          94              9         10
pMON17116/178          43              6          3
pMON17116/182          18              9         10
pMON17116/383          97              9         10

Table IXB Glyphosate tolerance in Class II EPSPS
(pMON17131 = P-FMV35S; R1 plants; Spray rate = 0.84 kg/ha)

Vector/Plant No.       Vegetative score**     Reproductive score
                       day 14                 day 28

17131/78               10                     10
17131/102               9                     10
17131/115               9                     10
17131/116               9                     10
17131/157               9                     10
17131/169              10                     10
17131/255              10                     10
control Westar          1                      0

Table IXC Glyphosate tolerance in Class I EPSPS canola transformants
(P-E35S; R2 plants; Spray rate = 0.28 kg/ha)

                       % resistant     Vegetative Score**
Vector/Plant No.       EPSPS*          day 7     day 14

Control Westar          0              4          2
pMON899/715            96              5          6
pMON899/744            95              8          8
pMON899/794            86              6          4
pMON899/818            81              7          8
pMON899/885            57              7          6

* % resistant EPSPS activity in the presence of 0.5 mM glyphosate
** A vegetative score of 10 indicates no damage, a score of 0 is given to a dead
plant.

The data obtained for the Class II EPSPS transformants may be
compared to glyphosate-tolerant Class I EPSPS transformants in which the
same promoter is used to express the EPSPS genes and in which the level of
glyphosate-tolerant EPSPS activity was comparable for the two types of
transformants. A comparison of the data of pMON17110 [in Table IXA] and
pMON17131 [in Table IXB] with that for pMON899 [in Table IXC; the Class I
gene in pMON899 is that from A. thaliana (Klee et al., 1987) in which the
glycine at position 101 was changed to an alanine] illustrates that the Class II
EPSPS is at least as good as that of the Class I EPSPS. An improvement in
vegetative tolerance of Class II EPSPS is apparent when one takes into
account that the Class II plants were sprayed at twice the rate and were
tested as R1 plants.

Example 2B

The construction of two plant transformation vectors and the
transformation procedures used to produce glyphosate-tolerant canola plants
are described in this example. The vectors, pMON17209 and pMON17237,
were used to generate transgenic glyphosate-tolerant canola lines. The vectors
each contain the gene encoding the 5-enol-pyruvyl-shikimate-3-phosphate
synthase (EPSPS) from Agrobacterium sp. strain CP4. The vectors also
contain either the gox gene encoding the glyphosate oxidoreductase enzyme
(GOX) from Achromobacter sp. strain LBAA (Barry et al., 1992) or the gene
encoding a variant of GOX (GOX v.247) which displays improved catalytic
properties. These enzymes convert glyphosate to aminomethylphosphonic
acid and glyoxylate and protect the plant from damage by the metabolic
inactivation of glyphosate. The combined result of providing an alternative,
resistant EPSPS enzyme and the metabolism of glyphosate produces
transgenic plants with enhanced tolerance to glyphosate.

Molecular biology techniques. In general, standard molecular biology
and microbial genetics approaches were employed (Maniatis et al., 1982).
Site-directed mutageneses were carried out as described by Kunkel et al.
(1987). Plant-preferred genes were synthesized and the sequence confirmed.

Plant transformation vectors. The following describes the general
features of the plant transformation vectors that were modified to form
vectors pMON17209 and pMON17237. The Agrobacterium mediated plant
transformation vectors contain the following well-characterized DNA
segments which are required for replication and function of the plasmids
(Rogers and Klee, 1987; Klee and Rogers, 1989). The first segment is the 0.45
kb ClaI-DraI fragment from the pTi15955 octopine Ti plasmid which contains
the T-DNA left border region (Barker et al., 1983). It is joined to the 0.75 kb
origin of replication (oriV) derived from the broad-host range plasmid RK2
(Stalker et al., 1981). The next segment is the 3.1 kb SalI-PvuI segment of
pBR322 which provides the origin of replication for maintenance in E. coli and
the bom site for the conjugational transfer into the Agrobacterium tumefaciens
cells (Bolivar et al., 1977). This is fused to the 0.93 kb fragment isolated from
transposon Tn7 which encodes bacterial spectinomycin and streptomycin
resistance (Fling et al., 1985), a determinant for the selection of the plasmids in
E. coli and Agrobacterium. It is fused to the 0.36 kb PvuI-BclI fragment from
the pTiT37 plasmid which contains the nopaline-type T-DNA right border
region (Fraley et al., 1985). Several chimeric genes engineered for plant
expression can be introduced between the Ti right and left border regions of the
vector. In addition to the elements described above, this vector also includes
the 35S promoter/NPTII/NOS 3' cassette to enable selection of transformed
plant tissues on kanamycin (Klee and Rogers, 1989; Fraley et al., 1983; and
Odell et al., 1985) within the borders. An "empty" expression cassette is also
present between the borders and consists of the enhanced E35S promoter
(Kay et al., 1987), the 3' region from the small subunit of RUBP carboxylase of
pea (E9) (Coruzzi et al., 1984; Morelli et al., 1986), and a number of restriction
enzyme sites that may be used for the cloning of DNA sequences for
expression in plants. The plant transformation system based on
Agrobacterium tumefaciens delivery has been reviewed (Klee and Rogers, 1989;
Fraley et al., 1986). The Agrobacterium mediated transfer and integration of
the vector T-DNA into the plant chromosome results in the expression of the
chimeric genes conferring the desired phenotype in plants.

Bacterial Inoculum. The binary vectors are mobilized into Agrobacterium 
tumefaciens strain ABI by the triparental conjugation system using the helper 
plasmid pRK2013 (Ditta et al., 1980). The ABI strain contains the disarmed
pTiC58 plasmid pMP90RK (Koncz and Schell, 1986) in the chloramphenicol 
resistant derivative of the Agrobacterium tumefaciens strain A208. 

Transformation procedure. Agrobacterium inocula were grown overnight
at 28°C in 2 ml of LBSCK (LBSCK is made as follows: LB liquid medium [1
liter volume] = 10 g NaCl; 5 g Yeast Extract; 10 g tryptone; pH 7.0, and
autoclave for 22 minutes. After autoclaving, add spectinomycin (50 mg/ml
stock) - 2 ml, kanamycin (50 mg/ml stock) - 1 ml, and chloramphenicol (25
mg/ml stock) - 1 ml.). One day prior to inoculation, the Agrobacterium was
subcultured by inoculating 200 µl into 2 ml of fresh LBSCK and grown
overnight. For inoculation of plant material, the culture was diluted with MSO
liquid medium to an A660 range of 0.2-0.4.
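The final antibiotic concentrations implied by the LBSCK recipe above can be worked out from the stock concentrations and volumes given. The sketch below is illustrative only; the names are hypothetical, and the small volume contributed by the stocks (4 ml in 1 liter) is ignored as an assumption.

```python
# Stock concentration (mg/ml) and volume added (ml) per 1 liter of LB,
# as stated in the LBSCK recipe.
LBSCK_ADDITIONS = {
    "spectinomycin": (50.0, 2.0),
    "kanamycin": (50.0, 1.0),
    "chloramphenicol": (25.0, 1.0),
}


def final_mg_per_liter(stock_mg_per_ml, ml_added, medium_liters=1.0):
    """Final concentration in mg/l, ignoring the added stock volume."""
    return stock_mg_per_ml * ml_added / medium_liters


def lbsck_final_concentrations(medium_liters=1.0):
    """Map each antibiotic to its final concentration in mg/l."""
    return {
        name: final_mg_per_liter(stock, ml, medium_liters)
        for name, (stock, ml) in LBSCK_ADDITIONS.items()
    }


if __name__ == "__main__":
    for name, conc in lbsck_final_concentrations().items():
        print(f"{name}: {conc:g} mg/l")
```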

Seedlings of Brassica napus cv. Westar were grown in Metro Mix 350 
(Hummert Seed Co., St. Louis, Mo.) in a growth chamber with a day/night 
temperature of 15/10°C, relative humidity of 50%, 16h/8h photoperiod, and at 
a light intensity of 500 µmol m-2 sec-1. The plants were watered daily (via
sub-irrigation) and fertilized every other day with Peters 15:30:15 (Fogelsville, 
Pa.). 






In general, all media recipes and the transformation protocol follow 
those in Fry et al. (1987). Five to six week-old Westar plants were harvested 
when the plants had bolted (but prior to flowering), the leaves and buds were
removed, and the 4-5 inches of stem below the flower buds were used as the
explant tissue source. Following sterilization with 70% ethanol for 1 min and
38% Clorox for 20 min, the stems were rinsed three times with sterile water 
and cut into 5 mm-long segments (the orientation of the basal end of the stem 
segments was noted). The plant material was incubated for 5 minutes with 
the diluted Agrobacterium culture at a rate of 5 ml of culture per 5 stems. 
The suspension of bacteria was removed by aspiration and the explants were 
placed basal side down - for an optimal shoot regeneration response - onto co- 
culture plates (1/10 MSO solid medium with a 1.5 ml TXD (tobacco xanthi 
diploid) liquid medium overlay and covered with a sterile 8.5 cm filter paper). 
Fifty-to-sixty stem explants were placed onto each co-culture plate. 

After a 2 day co-culture period, stem explants were moved onto MS
medium containing 750 mg/l carbenicillin, 50 mg/l cefotaxime, and 1 mg/l BAP
(benzylaminopurine) for 3 days. The stem explants were then placed for two
periods of three weeks each, again basal side down and with 5 explants per
plate, onto an MS/0.1 mM glyphosate selection medium (also containing
carbenicillin, cefotaxime, and BAP). (The glyphosate stock [0.5 M] is prepared as
described in the following: 8.45 g glyphosate [analytical grade] is dissolved in 50
ml deionized water, adding KOH pellets to dissolve the glyphosate, and the
volume is brought to 100 ml following adjusting the pH to 5.7. The solution is
filter-sterilized and stored at 4°C.) After 6 weeks on this glyphosate selection
medium, green, normally developing shoots were excised from the stem
explants and were placed onto fresh MS medium containing 750 mg/l
carbenicillin, 50 mg/l cefotaxime, and 1 mg/l BAP, for further shoot
development. When the shoots were 2-3 inches tall, a fresh cut at the end of
the stem was made, the cut end was dipped in Root-tone, and the shoot was
placed in Metro Mix 350 soil and allowed to harden-off for 2-3 weeks.
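The glyphosate stock described above (8.45 g brought to 100 ml) can be checked against its stated 0.5 M concentration, and the volume of stock needed for the 0.1 mM selection medium follows from a simple dilution. This is a sketch under stated assumptions: the molecular weight used is that of glyphosate free acid (169.07 g/mol), the function names are hypothetical, and the 1 liter batch size is an example not given in the text.

```python
GLYPHOSATE_MW = 169.07  # g/mol, glyphosate free acid (assumed form)


def molarity(grams, final_volume_ml, mw=GLYPHOSATE_MW):
    """Molar concentration of a solution made from `grams` of solute."""
    return (grams / mw) / (final_volume_ml / 1000.0)


def stock_volume_ml(stock_mM, final_mM, final_volume_ml):
    """Volume of stock needed so that C1 * V1 = C2 * V2."""
    return final_mM * final_volume_ml / stock_mM


if __name__ == "__main__":
    stock_M = molarity(8.45, 100.0)
    print(f"stock concentration: {stock_M:.3f} M")
    # Example batch: 1 liter of medium at 0.1 mM glyphosate.
    print(f"stock per liter: {stock_volume_ml(500.0, 0.1, 1000.0):.2f} ml")
```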

Construction of Canola transformation vector pMON17209. 

The EPSPS gene was isolated originally from Agrobacterium sp. strain CP4
and expresses a highly tolerant enzyme. The original gene contains sequences
that could be inimical to high expression of the gene in some plants. These
sequences include potential polyadenylation sites that are often A+T rich, a
higher G+C% than that frequently found in dicotyledonous plant genes (63%
versus ~50%), concentrated stretches of G and C residues, and codons that
may not be used frequently in dicotyledonous plant genes. The high G+C% in the
CP4 EPSPS gene could also result in the formation of strong hairpin structures
that may affect expression or stability of the RNA. A plant preferred version
of the gene was synthesized and used for these vectors. This coding sequence
was expressed in E. coli from a PRecA-gene10L vector (Olins et al., 1988) and
the EPSPS activity was compared with that from the native CP4 EPSPS
gene. The appKm for PEP for the native and synthetic genes was 11.8 µM and
12.7 µM, respectively, indicating that the enzyme expressed from the
synthetic gene was unaltered. The N-terminus of the coding sequence was
then mutagenized to place an SphI site (GCATGC) at the ATG to permit the
construction of the CTP2-CP4 synthetic fusion for chloroplast import. This
change had no apparent effect on the in vivo activity of CP4 EPSPS in E. coli
as judged by complementation of the aroA mutant. A CTP-CP4 EPSPS fusion
was constructed between the Arabidopsis thaliana EPSPS CTP (Klee et al.,
1987) and the CP4 EPSPS coding sequences. The Arabidopsis CTP was
engineered by site-directed mutagenesis to place an SphI restriction site at the
CTP processing site. This mutagenesis replaced the Glu-Lys at this location
with Cys-Met. The CTP2-CP4 EPSPS fusion was tested for import into
chloroplasts isolated from Lactuca sativa using the methods described
previously (della-Cioppa et al., 1986; 1987).

The GOX gene that encodes the glyphosate metabolizing enzyme
glyphosate oxidoreductase (GOX) was cloned originally from Achromobacter sp.
strain LBAA (Hallas et al., 1988; Barry et al., 1992). The gox gene from strain
LBAA was also resynthesized in a plant-preferred sequence version and in
which many of the restriction sites were removed (PCT Appln. No. WO
92/00377). The GOX protein is targeted to the plastids by a fusion between
the C-terminus of a CTP and the N-terminus of GOX. A CTP, derived from the
SSU1A gene from Arabidopsis thaliana (Timko et al., 1988) was used. This
CTP (CTP1) was constructed by a combination of site-directed mutageneses.
The CTP1 is made up of the SSU1A CTP (amino acids 1-55), the first 23
amino acids of the mature SSU1A protein (56-78), a serine residue (amino acid
79), a new segment that repeats amino acids 50 to 56 from the CTP and the
first two from the mature protein (amino acids 80-87), and an alanine and
methionine residue (amino acid 88 and 89). An NcoI restriction site is located
at the 3' end (spans the Met89 codon) to facilitate the construction of precise
fusions to the 5' of GOX. At a later stage, a BglII site was introduced
upstream of the N-terminus of the SSU1A sequences to facilitate the
introduction of the fusions into plant transformation vectors. A fusion was
assembled between CTP1 and the synthetic GOX gene.

The CP4 EPSPS and GOX genes were combined to form pMON17209 as
described in the following. The CTP2-CP4 EPSPS fusion was assembled and
inserted between the constitutive FMV35S promoter (Gowda et al., 1989;
Richins et al., 1987) and the E9 3' region (Coruzzi et al., 1984; Morelli et al.,
1985) in a pUC vector (Yanisch-Perron et al., 1985; Vieira and Messing, 1987)
to form pMON17190; this completed element may then be moved easily as a
NotI-NotI fragment to other vectors. The CTP1-GOX fusion was also
assembled in a pUC vector with the FMV35S promoter. This element was
then moved as a HindIII-BamHI fragment into the plant transformation
vector pMON10098 and joined to the E9 3' region in the process. The
resultant vector pMON17193 has a single NotI site into which the
FMV35S/CTP2-CP4 EPSPS/E9 3' element from pMON17190 was cloned to
form pMON17194. The kanamycin plant transformation selection cassette
(Fraley et al., 1985) was then deleted from pMON17194, by cutting with XhoI
and re-ligating, to form the pMON17209 vector (Figure 24).

Construction of Canola transformation vector pMON17237. 

The GOX enzyme has an apparent Km for glyphosate [appKm(glyphosate)] of
~25 mM. In an effort to improve the effectiveness of the glyphosate metabolic
rate in planta, a variant of GOX has been identified in which the
appKm(glyphosate) has been reduced approximately 10-fold; this variant is
referred to as GOX v.247 and the sequence differences between it and the
original plant-preferred GOX are illustrated in PCT Appln. No. WO 92/00377.
The GOX v.247 coding sequence was combined with CTP1 and assembled with
the FMV35S promoter and the E9 3' by cloning into the pMON17227 plant
transformation vector to form pMON17241. In this vector, effectively, the
CP4 EPSPS was replaced by GOX v.247. The pMON17227 vector had been
constructed by replacing the CTP1-GOX sequences in pMON17193 with those
for the CTP2-CP4 EPSPS, to form pMON17199 and followed by deleting the
kanamycin cassette (as described above for pMON17209). The pMON17237
vector (Figure 25) was then completed by cloning the FMV35S/CTP2-CP4
EPSPS/E9 3' element as a NotI-NotI fragment into pMON17241.
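To illustrate why a roughly 10-fold lower appKm matters at low substrate levels, the sketch below compares fractional rates under simple Michaelis-Menten kinetics. It is an illustration only: it assumes both enzymes share the same Vmax, which the text does not state, and the 1 mM glyphosate concentration is a hypothetical value chosen for the example.

```python
def fractional_rate(substrate_mM, km_mM):
    """v / Vmax for Michaelis-Menten kinetics: S / (Km + S)."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


KM_GOX_MM = 25.0               # approximate appKm from the text
KM_VARIANT_MM = KM_GOX_MM / 10  # approximately 10-fold lower, per the text

if __name__ == "__main__":
    s = 1.0  # hypothetical glyphosate concentration, mM
    original = fractional_rate(s, KM_GOX_MM)
    variant = fractional_rate(s, KM_VARIANT_MM)
    print(f"GOX:       {original:.3f} of Vmax")
    print(f"GOX v.247: {variant:.3f} of Vmax")
    print(f"ratio:     {variant / original:.1f}x")
```

Under these assumptions the advantage of the lower-Km variant approaches 10-fold as the substrate concentration falls well below either Km, and shrinks toward 1 as both enzymes saturate.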

Example 3

Soybean plants were transformed with the pMON13640 (Figure 15)
vector and a number of plant lines of the transformed soybean were obtained
which exhibit glyphosate tolerance.

Soybean plants are transformed with pMON13640 by the method of
microprojectile injection using particle gun technology as described in Christou
et al. (1988). The seed harvested from R0 plants is R1 seed which gives rise to
R1 plants. To evaluate the glyphosate tolerance of an R0 plant, its progeny are
evaluated. Because an R0 plant is assumed to be hemizygous at each insert
location, selfing results in maximum genotypic segregation in the R1. Because
each insert acts as a dominant allele, in the absence of linkage and assuming
only one hemizygous insert is required for tolerance expression, one insert
would segregate 3:1, two inserts, 15:1, three inserts 63:1, etc. Therefore,
relatively few R1 plants need be grown to find at least one resistant phenotype.

Seed from an R0 soybean plant is harvested, and dried before planting in
a glyphosate spray test. Seeds are planted into 4 inch (~10 cm) square pots
containing Metro 350. Twenty seedlings from each R0 plant are considered
adequate for testing. Plants are maintained and grown in a greenhouse
environment. A 12.5-14 hour photoperiod and temperatures of 30°C day and
24°C night are regulated. Water soluble Peters Pete Lite fertilizer is applied as
needed.
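The adequacy of twenty seedlings per R0 plant can be illustrated with a short probability calculation. The sketch below uses hypothetical function names and the same assumptions stated in the text (independent segregation of unlinked, dominant, hemizygous inserts); it is not a description of how the sample size was actually chosen.

```python
def p_at_least_one(n_plants, p_single):
    """Probability that at least one of n independent plants shows a trait."""
    return 1.0 - (1.0 - p_single) ** n_plants


def r1_sampling_summary(n_plants=20, n_inserts=1):
    """Expected outcome when n_plants R1 seedlings are grown from one R0 plant."""
    p_susceptible = 0.25 ** n_inserts   # progeny carrying no insert
    p_resistant = 1.0 - p_susceptible
    return {
        "expected_resistant": n_plants * p_resistant,
        "p_at_least_one_resistant": p_at_least_one(n_plants, p_resistant),
        "p_at_least_one_susceptible": p_at_least_one(n_plants, p_susceptible),
    }


if __name__ == "__main__":
    for key, value in r1_sampling_summary(20, 1).items():
        print(f"{key}: {value:.4f}")
```

With a single insert, twenty seedlings are expected to include about fifteen resistant plants, and both resistant and susceptible segregants are very likely to appear.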

A spray "batch" consists of several sets of R1 progenies all sprayed on
the same date. Some batches may also include evaluations of other than R1
plants. Each batch also includes sprayed and unsprayed non-transgenic
genotypes representing the genotypes in the particular batch which were
putatively transformed. Also included in a batch are one or more non-segregating
transformed genotypes previously identified as having some resistance.

One to two plants from each individual R0 progeny are not sprayed and
serve as controls to compare and measure the glyphosate tolerance, as well as
to assess any variability not induced by the glyphosate. When the other plants
reach the first trifoliate leaf stage, usually 2-3 weeks after planting, glyphosate
is applied at a rate equivalent of 128 oz/acre (8.895 kg/ha) of Roundup®. A
laboratory track sprayer has been calibrated to deliver a rate equivalent to
those conditions.

A vegetative score of 0 to 10 is used. The score is relative to the
unsprayed progenies from the same R0 plant. A 0 is death, while a 10
represents no visible difference from the unsprayed plant. A higher number
between 0 and 10 represents progressively less damage as compared to the
unsprayed plant. Plants are scored at 7, 14, and 28 days after treatment
(DAT). The data from the analysis of one set of transformed and control
soybean plants are described in Table X and show that the CP4 EPSPS gene
imparts glyphosate tolerance in soybean also.



Table X Glyphosate tolerance in Class II EPSPS soybean transformants
(P-E35S, P-FMV35S; R0 plants; Spray rate = 128 oz/acre)

                       Vegetative score
Vector/Plant No.       day 7     day 14     day 28



13640/40-11 
13640/40-3 
13640/40-7 
control A5403 2 
control A5403 1 



10 

7 



7 
10 



Example 4

The CP4 EPSPS gene may be used to select transformed plant material
directly on media containing glyphosate. The ability to select and to identify
transformed plant material depends, in most cases, on the use of a dominant
selectable marker gene to enable the preferential and continued growth of the
transformed tissues in the presence of a normally inhibitory substance.
Antibiotic resistance and herbicide tolerance genes have been used almost
exclusively as such dominant selectable marker genes in the presence of the
corresponding antibiotic or herbicide. The nptII/kanamycin selection scheme is
probably the most frequently used. It has been demonstrated that CP4
EPSPS is also a useful and perhaps superior selectable marker/selection
scheme for producing and identifying transformed plants.

A plant transformation vector that may be used in this scheme is 
pMON17227 (Figure 16). This plasmid resembles many of the other plasmids 
described infra and is essentially composed of the previously described 
bacterial replicon system that enables this plasmid to replicate in E. coli and 
to be introduced into and to replicate in Agrobacterium, the bacterial selectable 
marker gene (Spc/Str), and located between the T-DNA right border and left 
border is the CTP2-CP4 synthetic gene in the FMV35S promoter-E9 3' 
cassette. This plasmid also has single sites for a number of restriction 
enzymes, located within the borders and outside of the expression cassette. 
This makes it possible to easily add other genes and genetic elements to the 
vector for introduction into plants. 

The protocol for direct selection of transformed plants on glyphosate is 
outlined for tobacco. Explants are prepared for pre-culture as in the standard 
procedure described in Example 1: leaves from 1-month-old tobacco plants 
are surface sterilized (15 minutes in 10% Clorox + surfactant; 3X dH2O 
washes); explants are cut into 0.5 x 0.5 cm squares, removing leaf edges, mid-rib, 
tip, and petiole end for uniform tissue type; explants are placed in a single layer, 
upside down, on MS104 plates + 2 ml 4C005K media to moisten the surface; 
pre-culture is 1-2 days. Explants are inoculated using an overnight culture of 
Agrobacterium containing the plant transformation plasmid that is adjusted 
to a titer of 1.2 X 10^9 bacteria/ml with 4C005K media. Explants are placed into a 
centrifuge tube, the Agrobacterium suspension is added, and the mixture of 
bacteria and explants is "Vortexed" on maximum setting for 25 seconds to 
ensure even penetration of bacteria. The bacteria are poured off and the 
explants are blotted between layers of dry sterile filter paper to remove excess 
bacteria. The blotted explants are placed upside down on MS104 plates + 2 ml 
4C005K media + filter disc. Co-culture is 2-3 days. The explants are 
transferred to MS104 + Carbenicillin 1000 mg/l + cefotaxime 100 mg/l for 3 
days (delayed phase). The explants are then transferred to MS104 + 
glyphosate 0.05 mM + Carbenicillin 1000 mg/l + cefotaxime 100 mg/l for the 
selection phase. At 4-6 weeks shoots are cut from callus and placed on MSO + 
Carbenicillin 500 mg/l rooting media. Roots form in 3-5 days, at which time leaf 
pieces can be taken from rooted plates to confirm glyphosate tolerance and 
that the material is transformed. 
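
The titer adjustment in the inoculation step is a simple dilution (C1 x V1 = C2 x V2). The following is a minimal sketch only: the measured overnight titer and the final volume are hypothetical values chosen for illustration, and the function name `dilution_volumes` is not from the protocol. Only the target titer comes from the text.

```python
TARGET_TITER = 1.2e9  # bacteria/ml, the target titer stated in the protocol


def dilution_volumes(stock_titer, final_volume_ml, target_titer=TARGET_TITER):
    """Return (ml of overnight culture, ml of diluent) from C1*V1 = C2*V2."""
    if stock_titer < target_titer:
        raise ValueError("overnight culture is already below the target titer")
    culture_ml = target_titer * final_volume_ml / stock_titer
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical example: culture measured at 3.0e9 bacteria/ml, 10 ml needed.
culture_ml, medium_ml = dilution_volumes(3.0e9, 10.0)
print(f"{culture_ml:.1f} ml culture + {medium_ml:.1f} ml 4C005K media")
```

With these assumed inputs the sketch calls for 4.0 ml of culture made up to 10 ml with 4C005K media.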

The presence of the CP4 EPSPS protein in these transformed tissues 
has been confirmed by immunoblot analysis of leaf discs. The data from one 
experiment with pMON17227 are as follows: 139 shoots formed 
on glyphosate from 400 explants inoculated with Agrobacterium 
ABI/pMON17227; 97 of these were positive on recallusing on glyphosate. 
These data indicate a transformation rate of 24 per 100 explants, which 
makes this a highly efficient and time saving transformation procedure for 
plants. Similar transformation frequencies have been obtained with 
pMON17131, and direct selection of transformants on glyphosate with the CP4 
EPSPS genes has also been shown in other plant species, including 
Arabidopsis, soybean, corn, wheat, potato, tomato, cotton, lettuce, and 
sugarbeet. 
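
The transformation rate quoted above follows directly from the counts given for the pMON17227 experiment. A short sketch of the arithmetic, using only the numbers stated in the text (the variable names are illustrative):

```python
explants = 400    # explants inoculated with ABI/pMON17227
shoots = 139      # shoots that formed on glyphosate
confirmed = 97    # shoots positive on recallusing on glyphosate

# Confirmed transformants per 100 inoculated explants.
rate_per_100 = 100 * confirmed / explants
# Share of glyphosate-selected shoots that were confirmed on recallusing.
confirm_fraction = confirmed / shoots

print(f"{rate_per_100:.2f} confirmed transformants per 100 explants")
print(f"{confirm_fraction:.1%} of selected shoots confirmed")
```

The first figure is 24.25, which the text reports as 24 per 100 explants.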

The pMON17227 plasmid contains single restriction enzyme recognition 
cleavage sites (NotI, XhoI, and BstXI) between the CP4 glyphosate selection 
region and the left border of the vector for the cloning of additional genes and to 
facilitate the introduction of these genes into plants. 

Example 5A 

The CP4 EPSPS gene has also been introduced into Black Mexican 
Sweet (BMS) corn cells with expression of the protein and glyphosate 
resistance detected in callus. 

The backbone for this plasmid was a derivative of the high copy plasmid 
pUC119 (Vieira and Messing, 1987). The 1.3 Kb FspI-DraI pUC119 fragment 
containing the origin of replication was fused to the 1.3 Kb SmaI-HindIII filled 
fragment from pKC7 (Rao and Rogers, 1979) which contains the neomycin 
phosphotransferase type II gene to confer bacterial kanamycin resistance. 
This plasmid was used to construct a monocot expression cassette vector 
containing the 0.6 kb cauliflower mosaic virus (CaMV) 35S RNA promoter 
with a duplication of the -90 to -300 region (Kay et al., 1987), a 0.8 kb 
fragment containing an intron from a maize gene in the 5' untranslated leader 
region, followed by a polylinker and the 3' termination sequences from the 
nopaline synthase (NOS) gene (Fraley et al., 1983). A 1.7 Kb fragment 
containing the 300 bp chloroplast transit peptide from the Arabidopsis EPSP 
synthase fused in frame to the 1.4 Kb coding sequence for the bacterial CP4 
EPSP synthase was inserted into the monocot expression cassette in the 
polylinker between the intron and the NOS termination sequence to form the 
plasmid pMON19653 (Figure 17). 

pMON19653 DNA was introduced into Black Mexican Sweet (BMS) cells 
by co-bombardment with EC9, a plasmid containing a sulfonylurea-resistant 
form of the maize acetolactate synthase gene. 2.5 mg of each plasmid was 
coated onto tungsten particles and introduced into log-phase BMS cells using a 
PDS-1000 particle gun essentially as described (Klein et al., 1989). 
Transformants are selected on MS medium containing 20 ppb chlorsulfuron. 
After initial selection on chlorsulfuron, the calli can be assayed directly by 
Western blot. Glyphosate tolerance can be assessed by transferring the calli to 
medium containing 5 mM glyphosate. As shown in Table XI, CP4 EPSPS 
confers glyphosate tolerance to corn callus. 

Table XI. Expression of CP4 in BMS Corn Callus, pMON19653 

Line      CP4 expression 
          (% extracted protein) 

284       0.006 
287       0.036 
290       0.061 
295       0.073 
299       0.113 
309       0.042 
313       0.003 
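
The spread of expression across the seven callus lines can be summarized with a few lines of code. This is only an illustrative summary of the values in Table XI; the statistics chosen (highest and lowest line, mean, fold range) are not part of the original analysis.

```python
from statistics import mean

# CP4 expression per BMS callus line, % of extracted protein (Table XI).
expression = {284: 0.006, 287: 0.036, 290: 0.061, 295: 0.073,
              299: 0.113, 309: 0.042, 313: 0.003}

highest = max(expression, key=expression.get)   # line with the most CP4
lowest = min(expression, key=expression.get)    # line with the least CP4
average = mean(expression.values())
fold_range = expression[highest] / expression[lowest]

print(f"highest: line {highest}, lowest: line {lowest}")
print(f"mean expression: {average:.3f}% of extracted protein")
print(f"fold range between lines: {fold_range:.1f}")
```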



To measure CP4 EPSPS expression in corn callus, the following 
procedure was used: BMS callus (3 g wet weight) was dried on filter paper 
(Whatman #1) under vacuum, reweighed, and extraction buffer (500 µl/g dry 
weight; 100 mM Tris, 1 mM EDTA, 10% glycerol) was added. The tissue was 
homogenized with a Wheaton overhead stirrer for 30 seconds at 2.8 power 
setting. After centrifugation (3 minutes, Eppendorf microfuge), the 
supernatant was removed and the protein was quantitated (BioRad Protein 
Assay). Samples (50 µg/well) were loaded on an SDS PAGE gel (Jule, 3-17%) 
along with CP4 EPSPS standard (10 ng), electrophoresed, and transferred to 
nitrocellulose similarly to a previously described method (Padgette, 1987). The 
nitrocellulose blot was probed with goat anti-CP4 EPSPS IgG, and developed 
with I-125 Protein G. The radioactive blot was visualized by autoradiography. 
Results were quantitated by densitometry on an LKB UltraScan XL laser 
densitometer and are tabulated in Table XI. 

Table XII. Glyphosate resistance in BMS Corn Callus 
using pMON19653 

Vector         Experiment    # chlorsulfuron-     # cross-resistant 
                             resistant lines      to Glyphosate 

19653          253           120                  81/120 = 67.5% 
19653          254           80                   37/80 = 46% 
EC9 control    253/254       8                    0/8 = 0% 
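
The cross-resistance percentages in Table XII can be recomputed from the line counts. A brief sketch using only the counts in the table (the helper name `cross_resistance_pct` is illustrative):

```python
def cross_resistance_pct(resistant, total):
    """Percentage of chlorsulfuron-resistant lines also resistant to glyphosate."""
    return 100 * resistant / total


# (vector, experiment, glyphosate cross-resistant lines, chlorsulfuron-resistant lines)
rows = [
    ("19653", "253", 81, 120),
    ("19653", "254", 37, 80),
    ("EC9 control", "253/254", 0, 8),
]

percentages = [cross_resistance_pct(r, t) for _, _, r, t in rows]
for (vector, experiment, r, t), pct in zip(rows, percentages):
    print(f"{vector:12s} {experiment:8s} {r}/{t} = {pct:.2f}%")
```

The second row evaluates to 46.25%, which the table rounds to 46%.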



Improvements in the expression of Class II EPSPS could also be 
achieved by expressing the gene using stronger plant promoters, using better 3' 
polyadenylation signal sequences, optimizing the sequences around the 
initiation codon for ribosome loading and translation initiation, or by 
combination of these or other expression or regulatory sequences or factors. 



Example 5B 

The plant-expressible genes encoding the CP4 EPSPS and a glyphosate 
oxidoreductase enzyme (PCT Pub. No. WO92/00377) were introduced into 
embryogenic corn callus through particle bombardment. Plasmid DNA was 
prepared using standard procedures (Ausubel et al., 1987), cesium-chloride 
purified, and re-suspended at 1 mg/ml in TE buffer. DNA was precipitated onto 
M10 tungsten or 1.0 µ gold particles (BioRad) using a calcium 
chloride/spermidine precipitation protocol, essentially as described by Klein et 
al. (1987). The PDS1000® gunpowder gun (BioRad) was used. Callus tissue 
was obtained by isolating 1-2 mm long immature embryos from the "Hi-II" 
genotype (Armstrong et al., 1991), or Hi-II X B73 crosses, onto a modified N6 
medium (Armstrong and Green, 1985; Songstad et al., 1991). Embryogenic 
callus ("type-II"; Armstrong and Green, 1985) initiated from these embryos 
was maintained by subculturing at two week intervals, and was bombarded 
when less than two months old. Each plate of callus tissue was bombarded 
from 1 to 3 times with either tungsten or gold particles coated with the plasmid 
DNA(s) of interest. Callus was transferred to a modified N6 medium 
containing an appropriate selective agent (either glyphosate, or one or more of 
the antibiotics kanamycin, G418, or paromomycin) 1-8 days following 
bombardment, and then re-transferred to fresh selection media at 2-3 week 
intervals. Glyphosate-resistant calli first appeared approximately 6-12 weeks 
post-bombardment. These resistant calli were propagated on selection 
medium, and samples were taken for assays of gene expression. Plant 
regeneration from resistant calli was accomplished essentially as described by 
Petersen et al. (1992). 

In some cases, both genes were covalently linked together on the same 
plasmid DNA molecule. In other instances, the genes were present on 
separate plasmids, but were introduced into the same plant through a process 
termed "co-transformation". The 1 mg/ml plasmid preparations of interest 
were mixed together in an equal ratio, by volume, and then precipitated onto 
the tungsten or gold particles. At a high frequency, as described in the 
literature (e.g., Schocher et al., 1986), the different plasmid molecules integrate 
into the genome of the same plant cell. Generally the integration is into the 
same chromosomal location in the plant cell, presumably due to recombination 
of the plasmids prior to integration. Less frequently, the different plasmids 
integrate into separate chromosomal locations. In either case, there is 
integration of both DNA molecules into the same plant cell, and any plants 
produced from that cell. 

Transgenic corn plants were produced as described above which 
contained a plant-expressible CP4 gene and a plant-expressible gene encoding 
a glyphosate oxidoreductase enzyme. 

The plant-expressible CP4 gene comprised a structural DNA sequence 
encoding a CTP2/CP4 EPSPS fusion protein. The CTP2/CP4 EPSPS is a gene 
fusion composed of the N-terminal 0.23 Kb chloroplast transit peptide 
sequence from the Arabidopsis thaliana EPSPS gene (Klee et al., 1987, referred 
to herein as CTP2), and the C-terminal 1.36 Kb 5-enolpyruvylshikimate-3- 
phosphate synthase gene (CP4) from an Agrobacterium species. Plant 
expression of the gene fusion produces a pre-protein which is rapidly imported 
into chloroplasts where the CTP is cleaved and degraded (della-Cioppa et al., 
1986) releasing the mature CP4 protein. 

The plant-expressible gene expressing a glyphosate oxidoreductase 
enzyme comprised a structural DNA sequence comprising a CTP1/GOXsyn gene 
fusion composed of the N-terminal 0.26 Kb chloroplast transit peptide 
sequence derived from the Arabidopsis thaliana SSU 1a gene (Timko et al., 
1988, referred to herein as CTP1), and the C-terminal 1.3 Kb synthetic gene 
sequence encoding a glyphosate oxidoreductase enzyme (GOXsyn, as described 
in PCT Pub. No. WO92/00377 previously incorporated by reference). The 
GOXsyn gene encodes the enzyme glyphosate oxidoreductase from an 
Achromobacter sp. strain LBAA which catalyzes the conversion of glyphosate 
to herbicidally inactive products, aminomethylphosphonate and glyoxylate. 
Plant expression of the gene fusion produces a pre-protein which is rapidly 
imported into chloroplasts where the CTP is cleaved and degraded (della- 
Cioppa et al., 1986) releasing the mature GOX protein. 

Both of the above described genes also include the following regulatory 
sequences for plant expression: (i) a promoter region comprising a 0.6 Kb 35S 
cauliflower mosaic virus (CaMV) promoter (Odell et al., 1985) with the 
duplicated enhancer region (Kay et al., 1987) which also contains a 0.8 Kb 
fragment containing the first intron from the maize heat shock protein 70 gene 
(Shah et al., 1985 and PCT Pub. No. WO93/19189, the disclosure of which is 
hereby incorporated by reference); and (ii) a 3' non-translated region 
comprising a 0.3 Kb fragment of the 3' non-translated region of the nopaline 
synthase gene (Fraley et al., 1983 and Depicker et al., 1982) which functions 
to direct polyadenylation of the mRNA. 

The above described transgenic corn plants exhibit tolerance to 
glyphosate herbicide in greenhouse and field trials. 

Example 6 

The LBAA Class II EPSPS gene has been introduced into plants and 
also imparts glyphosate tolerance. Data on tobacco transformed with 
pMON17206 (infra) are presented in Table XIII. 



From the foregoing, it will be recognized that this invention is one well 
adapted to attain all the ends and objects hereinabove set forth together with 
advantages which are obvious and which are inherent to the invention. 
It will be further understood that certain features and subcombinations are of 
utility and may be employed without reference to other features and 
subcombinations. This is contemplated by and is within the scope of the 
claims. Since many possible embodiments may be made of the invention 
without departing from the scope thereof, it is to be understood that all matter 
herein set forth or shown in the accompanying drawings is to be interpreted as 
illustrative and not in a limiting sense. 



Table XIII. Tobacco Glyphosate Spray Test 
(pMON17206: E35S - CTP2-LBAA EPSPS; 0.4 lbs/ac) 

Line                 7 Day Rating 

33358                9 
34586                9 
33328                9 
34606                9 
33377                9 
34611                10 
34607                10 
34601                9 
34589                9 
Samsun (Control)     4 






BIBLIOGRAPHY 

Alm, R.A., Dalrymple, B.P. and Mattick, J.S. 1994. Sequencing and expression 
of the aroA gene from Dichelobacter nodosus. Gene 145: 97-101. 

Alton, N.K. and Vapnek, D. (1979) Nature 282:864-869. 

Ammirato, P.V., et al. Handbook of Plant Cell Culture - Crop Species. 
Macmillan Publ. Co. (1984). 

Armstrong, C.L., and Green, C.E. 1985. Establishment and maintenance of 
friable, embryogenic maize callus and the involvement of L-proline. 
Planta 164:207-214. 

Armstrong, C.L., Green, C.E., and Phillips, R.L. 1991. Development and 
availability of germplasm with high Type II culture formation response. 
Maize Genetics Cooperation NewsLetter 65:92-93. 

Arnon, D.I. Plant Physiol. 24:1-15 (1949). 

Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, 
J.A., and Struhl, K. 1987. CURRENT PROTOCOLS IN MOLECULAR 
BIOLOGY, John Wiley and Sons, Inc. New York. 

Bachmann, B. J. et al., Microb. Rev. 44:1-56 (1980). 

Barker, R., Idler, K., Thompson, D., and Kemp, J. (1983) Nucleotide sequence 
of the T-DNA region from the Agrobacterium tumefaciens Ti plasmid pTi15955. 
Plant Mol Biol 2: 335-350. 






Barry, G., Kishore, G., Padgette, S., Taylor, M., Kolacz, K., Weldon, M., Re, D., 
Eichholtz, D., Fincher, K., and Hallas, L. (1992) Inhibitors of amino acid 
biosynthesis: Strategies for imparting glyphosate tolerance to crop plants. In: 
Biosynthesis and Molecular Regulation of Amino Acids in Plants, pp. 139-145. 
[Edited by Singh, B. K., Flores, H. E., and Shannon, J. C.] American Society of 
Plant Physiologists, Rockville, Maryland. 

Bartlett, S.G., Grossman, A.R., and Chua, N.H. (1982) in Methods in 
Chloroplast Molecular Biology, pp. 1081-1091. M. Edelman, R.B. Hallick, and 
N.H. Chua, eds. 

Bevan, M. (1984) Nucleic Acids Res. 12 (22): 8711-8721. 

Birnboim, H. C. and Doly, J. (1979) A rapid alkaline extraction procedure for 
screening recombinant plasmid DNA. 
Nucl. Acids Res. 7:1513-1525. 

Bolivar, F., Rodriguez, R. L., Greene, P. J., Betlach, M. B., Heynecker, H. L., 
Boyer, H. W., Crosa, J. H., and Falkow, S. (1977) Construction and 
characterization of new cloning vehicles, II. A multi-purpose cloning system. 
Gene 2: 95-113. 

Boyer, H. W. and Rolland-Dussoix, D. (1969) A complementation analysis of 
the restriction and modification of DNA in Escherichia coli. J. Mol. Biol. 41:459. 

Carrer, H., Hockenberry, T. N., Svab, Z., and Maliga, P. (1993) Kanamycin 
resistance as a selectable marker for plastid transformation in tobacco. Mol. 
Gen. Genet. 241: 49-56. 






Christou, P., D. E. McCabe, and W.F. Swain (1988) Stable transformation of 
Soybean Callus by DNA-Coated Gold Particles. Plant Physiol. 87:671-674. 

Coruzzi, G., Broglie, R., Edwards, C., and Chua, N.H. (1984). Tissue-specific 
and light-regulated expression of a pea nuclear gene encoding the small subunit 
of ribulose-1,5-bisphosphate carboxylase. EMBO J 3:1671. 

Dalla Chiesa, M., Mayes, S.R., Maskell, D.J., Nixon, P.J. and Barber, J. 1994 
An AroA homologue from Synechocystis sp. PCC6803. Gene 144: 145-146. 

della-Cioppa, G., Bauer, S. C., Klein, B. K., Shah, D. M., Fraley, R. T. and 
Kishore, G. M. (1986) Translocation of the precursor of 5-enolpyruvylshikimate- 
3-phosphate synthase into chloroplasts of higher plants in vitro. Proc. Natl. 
Acad. Sci. USA 83: 6873-6877. 

della-Cioppa, G., Bauer, S. C., Taylor, M. T., Rochester, D. E., Klein, B. K., 
Shah, D. M., Fraley, R. T. and Kishore, G. M. (1987) Targeting a herbicide- 
resistant enzyme from Escherichia coli to chloroplasts of higher plants. 
Bio/Technology 5: 579-584. 

Depicker, A., Stachel, S., Dhaese, P., Zambryski, P., and Goodman, H.M. 1982. 
Nopaline Synthase: Transcript Mapping and DNA Sequence. J. MOLEC. 
APPL. GENETICS 1:561-573. 

Devereux, J., Haeberli, P. and Smithies, O. (1984) A comprehensive set of 
sequence analysis programs for the VAX. 
Nucl. Acids Res. 12:387-395. 






Ditta, G., Stanfield, S., Corbin, D., and Helinski, D.R. (1980) Broad host range 
DNA cloning system for Gram-Negative bacteria: construction of a gene bank 
of Rhizobium meliloti. Proc Natl Acad Sci USA 77, 7347-7351. 

Duncan, K., Edwards, R.M., Coggins, J.R. (1987) The pentafunctional aroM 
enzyme of Saccharomyces cerevisiae is a mosaic of monofunctional domains. 
Biochem. J. 246: 375-386. 

Dunn, J.J. and Studier, F.W., (1983) J. Mol. Biol. 166:477-535. 

Fitzgibbon, J. E. (1988) Pseudomonas sp. strain PG2982: uptake of glyphosate 
and cloning of a gene which confers increased resistance to glyphosate. Ph. D. 
Dissertation, Louisiana State University. 

Fitzgibbon, E. F. and Braymer, H. D. (1990) Cloning of a gene from 
Pseudomonas sp. PG2982 conferring increased glyphosate resistance. Appl. 
Environ. Microbiol. 56: 3382-3388. 

Fling, M.E., Kopf, J., and Richards, C. (1985). Nucleotide sequence of the 
transposon Tn7 gene encoding an aminoglycoside-modifying enzyme, 
3"(9)-O-nucleotidyltransferase. Nucleic Acids Res. 13 no. 19, 7095-7106. 

Fraley, R.T., Rogers, S.G., Horsch, R.B., Sanders, P.R., Flick, J.S., Adams, S.P., 
Bittner, M.L., Brand, L.A., Fink, C.L., Fry, J.S., Galluppi, G.R., Goldberg, S.B., 
Hoffman, N.L., and Woo, S.C. 1983. Expression of bacterial genes in plant 
cells. Proc. Natl. Acad. Sci. USA 80:4803-4807. 






Fraley, R. T., Rogers, S. G., Horsch, R. B., Eichholtz, D. A., Flick, J. S., Fink, C. 
L., Hoffmann, N. L. and Sanders, P. R. (1985) The SEV system: a new 
disarmed Ti plasmid vector system for plant transformation. 
Bio/Technology 3: 629-635. 

Fromm, M., (1990) UCLA Symposium on Molecular Strategies for Crop 
Improvement, April 16-22, 1990. Keystone, CO. 

Fry J., Barnason A., and Horsch R. (1987) Plant Cell Reports 6: 321-325. 

Gasser, C. S., Winter, J. A., Hironaka, C. M. and Shah, D. M. (1988) Structure, 
expression, and evolution of the 5-enolpyruvylshikimate 3-phosphate synthase 
genes of petunia and tomato. J. Biol. Chem . 263: 4280-4289. 

Gowda, S., Wu, F.C., and Shepard, R.J. (1989). Identification of promoter 
sequences for the major RNA transcripts of figwort mosaic and peanut 
chlorotic streak viruses (caulimovirus group). 
Journal of Cellular Biochemistry supplement 13D, 301 (Abstract). 

Hallas, L. E., Hahn, E. M. and Korndorfer, C. (1988) Characterization of 
microbial traits associated with glyphosate biodegradation in industrial 
activated sludge. J. Industrial Microbiol . 3: 377-385. 

Hayford, M. B., Medford, J. I., Hoffmann, N. L., Rogers, S. G. and Klee, H. J. 
(1988) Development of a plant transformation selection system based on 
expression of genes encoding gentamicin acetyltransferases. Plant Physiol. 86: 
1216-1222. 



Herrera-Estrella, L., et al. (1983) Nature 303:209 






Heitkamp, M. A., Hallas, L. and Adams, W. J. (1990) Biotreatment of 
industrial wastewater with immobilized microorganisms - Presented in Session 
11, Paper S40, Society for Industrial Microbiology Annual Meeting, Orlando, 
Florida, July 29-August 3, 1990. 

Henner, J. H., Band, L. and Shimotsu, H. (1984) Nucleotide sequence of the 
Bacillus subtilis tryptophan operon. Gene 34: 169-177. 

Henner, J. H., Band, L., Flaggs, G. and Chen, E. (1986) The organization and 
nucleotide sequence of the Bacillus subtilis hisH, tyrA and aroE genes. 
Gene 49: 147-152. 

Hohn, B. and Collins J. (1980) A small cosmid for efficient cloning of large DNA 
fragments. Gene 11: 291-298. 

Horsch, R.B. and H. Klee. (1986) Proc. Natl. Acad. Sci. U.S.A. 83:4428-32. 

Hunkapiller, M. W., Hewick, R. M., Dreyer, R. J., and Hood, L. (1983) 
Methods Enzymol. 91: 399-413. 

Jefferson, R.A., Kavanaugh, T.A. and Bevan, M.W., 1987, 
EMBO J. 6:3901-3907. 

Kay, R., Chan, A., Daly, M. and McPherson, J. 1987. Duplication of the CaMV 
35S promoter sequence creates a strong enhancer for plants. 
Science 236, 1299-1302. 






Kishore, G., Shah, D., Padgette, S., della-Cioppa, G., Gasser, C., Re, D., 
Hironaka, C., Taylor, M., Wibbenmeyer, J., Eichholtz, D., Hayford, M., 
Hoffmann, N., Delannay, X., Horsch, R., Klee, H., Rogers, S., Rochester, D., 
Brundage, L., Sanders, P. and Fraley, R. T. (1988) 5-Enolpyruvylshikimate 3- 
phosphate synthase: From Biochemistry to genetic engineering of glyphosate 
tolerance, in Biotechnology for Crop Protection ACS Symposium series No. 
379. Eds. Hedlin P. A., Menn, J. J. and Hollingsworth, R. M. pp. 37-48. 

Kishore, G. and Shah, D. (1988) Ann. Rev. Biochem. 57:627-663. 

Kishore, G. M., Brundage, L., Kolk, K., Padgette, S. R., Rochester, D., Huynh, Q. 
K. and della-Cioppa, G. (1986) Fed. Proc. 45: 1506. 

Klee, H.J., et al. (1985) Bio/Technology 3:637-42. 

Klee, H. J., Muskopf, Y. M. and Gasser, C. S. (1987) Cloning of an Arabidopsis 
thaliana gene encoding 5-enolpyruvyl-shikimate-3-phosphate synthase: 
sequence analysis and manipulation to obtain glyphosate-tolerant plants. 
Mol. Gen. Genet. 210: 437-442. 

Klee, H. J. and Rogers, S. G. (1989) Plant gene vectors and genetic 
transformation: plant transformation systems based on the use of 
Agrobacterium tumefaciens, in Cell Culture and Somatic Cell Genetics of 
Plants, eds. J. Schell and I.K. Vasil. 6: 1-23. 

Klein, T.M., Kornstein, L., Sanford, J.C., and Fromm, M.E. 1989. Genetic 
transformation of maize cells by particle bombardment. 
Plant Phys. 91:440-444. 






Koncz, C. and Schell, J. (1986) The promoter of TL-DNA gene 5 controls the 
tissue-specific expression of chimeric genes carried by a novel type of 
Agrobacterium binary vector. Mol. Gen. Genet. 204:383-396. 

Kunkel, T. A., Roberts, J. D. and Zakour, R. A. (1987) Rapid and efficient site- 
specific mutagenesis without phenotypic selection. 
Methods Enzymol. 154:367. 

Laemmli, U.K. (1970), "Cleavage of structural proteins during the assembly of 
the head of the bacteriophage T4" Nature 227:680. 

Maliga, P., Carrer, H., Kanevski, I., Staub, J., and Svab, Z. (1993) Plastid 
engineering in land plants: a conservative genome is open to change. Philos. 
Trans. R. Soc. London B Biol. Sci. 342: 203-208. 

Maniatis, T., Fritsch, E. F. and Sambrook, J. (1982) Molecular Cloning: a 
laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New 
York. 

Maskell, D.J., Morrissey, P. and Dougan, G. (1988) Cloning and nucleotide 
sequence of the aroA gene of Bordetella pertussis. 
J. Bacteriol. 170:2467-2471. 

Miller, J. H. (1972). Experiments in Molecular Genetics. Cold Spring Harbor 
Laboratory, Cold Spring Harbor, New York. 

Moore, J. K., Braymer, H. D. and Larson, A. D. (1983) Isolation of a 
Pseudomonas sp. which utilizes the phosphonate herbicide glyphosate. Appl. 
Environ. Microbiol. 46: 316-320. 






Morelli, G., Nagy, F., Fraley, R.T., Rogers, S.G., and Chua, N. H. (1985). A short 
conserved sequence is involved in the light-inducibility of a gene encoding 
ribulose 1,5-bisphosphate carboxylase small subunit of pea. 
Nature 315, 200-204. 

O'Connell, C., Pattee, P. A. and Foster, T. J. (1993) Sequence and mapping of 
the aroA gene of Staphylococcus aureus 8325-4. 
J. Gen. Micr. 139: 1449-1460. 

Odell, J.T., Nagy, F., and Chua, N.H. (1985). Identification of DNA sequences 
required for activity of the cauliflower mosaic virus 35S promoter. 
Nature 313, 810-812. 

Olins, P. O., Devine, C. S., Rangwala, S. H. and Kavka, K. S. 
(1988) Gene 73: 227-235. 

O'Neill, C., Horvath, G. V., Horvath, E., Dix, P. J. and Medgyesy, P. (1993) 
Chloroplast transformation in plants: polyethylene glycol (PEG) treatment of 
protoplasts is an alternative to biolistic delivery systems. Plant J. 3: 729-738. 

Padgette, S. R., Huynh, Q. K., Borgmeyer, J., Shah, D. M., Brand, L. A., Re, D. 
B., Bishop, B. F., Rogers, S. G., Fraley, R. T., and Kishore, G. (1987) Bacterial 
expression and isolation of Petunia hybrida 5-enol-pyruvylshikimate-3- 
phosphate synthase. Arch. Biochem. Biophys. 258, 564-573. 

Padgette, S. R., Huynh, Q. K., Aykent, S., Sammons, R. D., Sikorski, J. A., and 
Kishore, G. M. (1988) J. Biol. Chem. 263, 1798-1802. 






Petersen, W.L., Sulc, S., and Armstrong, C.L. 1992. Effect of nurse cultures 
on the production of macro-calli and fertile plants from maize embryogenic 
suspension protoplasts. Plant Cell Reports 10:591-594. 

Quinn, J. P., Peden, J. M. M. and Dick, E. (1988) Glyphosate tolerance and 
utilization by the microflora of soils treated with the herbicide. 
Appl. Microbiol. Biotechnol. 29: 511-516. 

Rao, R.N. and Rogers, S.G. (1979). Plasmid pKC7: A vector containing ten 
restriction endonuclease sites suitable for cloning DNA segments. Gene 7:79. 

Richins, R. D., Scholthof, H. B., and Shepard, R. J. (1987) Sequence of the 
figwort mosaic virus DNA (caulimovirus group). 
Nucl. Acids Res. 15: 8451-8466. 

Rogers, S.G., Brand, L.A., Holder, S.B., Sharps, E.S. and Brackin, M.J. (1983) 
Amplification of the aroA gene from E. coli results in tolerance to the herbicide 
glyphosate. Appl. Environ. Microbiol. 46:37-43. 

Rogers, S.G. and Klee, H. J. (1987). "Pathways to genetic manipulation 
employing Agrobacterium." in Plant Gene Research. Plant DNA Infectious 
Agents. Vol IV, Hohn, T. and Schell, J., eds. Springer-Verlag, Vienna, 
pp. 179-203. 

Sambrook, J., Fritsch, E.F. and Maniatis, T., (1989) 
Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory 
Press, Cold Spring Harbor, New York. 






Schocher, R.J., Shillito, R.D., Saul, M.W., Paszkowski, J., and Potrykus, I. 
(1986). Co-transformation of unlinked foreign genes into plants by direct gene 
transfer. Bio/Technology 4:1093-1097. 

Songstad, D.D., Armstrong, C.L., and Petersen, W.L. (1991). AgNO3 increases 
type II callus production from immature embryos of maize inbred B73 and its 
derivatives. Plant Cell Reports 9: 699-702. 

Schuler, M. A., Schmitt, E. S. and Beachy, R.N. (1982) 
Nucleic Acids Res. 10:8225-8244. 

Schulz, A., Kruper, A. and Amrhein, N. (1985) Differential sensitivity of 
bacterial 5-enolpyruvylshikimate-3-phosphate synthases to the herbicide 
glyphosate. FEMS Microbiol. Lett. 28: 297-301. 

Schulz, A., Sost, D. and Amrhein, N. (1984) Arch. Microbiol. 137: 121-123. 

Shah, D., Horsch, R., Klee, H., Kishore, G., Winter, J., Turner, N., Hironaka, C., 
Sanders, P., Gasser, C., Aykent, S., Siegal, N., Rogers, S., and Fraley, R. 
(1986). Engineering herbicide tolerance in transgenic plants. 
Science 233, 478-481. 

Shah, D.M., Rochester, D.E., Krivi, G., Hironaka, C., Mozer, T.J., Fraley, R.T., 
and D.C. Tiemeier. 1985. Structure and expression of the maize hsp70 gene. 
Cell. and Mol. Biol. of Plant Stress. Alan R. Liss, Inc. pp. 181-200. 

Shimamoto, K. et al. (1989) Nature 338:274-276. 

Sost, D., Schulz, A. and Amrhein, N. (1984) FEBS Lett. 173: 238-241. 






Sost, D. and Amrhein, N. (1990) Substitution of Gly-96 to Ala in the 5- 
enolpyruvylshikimate 3-phosphate synthase of Klebsiella pneumoniae results 
in greatly reduced affinity for the herbicide glyphosate. 
,282:433-436. 



Stalker, D.M., Thomas, C.M., and Helinski, D.R. (1981). Nucleotide sequence of 
the region of the origin of replication of the broad host range plasmid RK2. 
Mol. Gen. Genet. 181: 8-12. 

Stalker, D. M., Hiatt, W. R. and Comai, L. (1985) A single amino acid 
substitution in the enzyme 5-enolpyruvylshikimate 3-phosphate synthase 
confers resistance to glyphosate. J. Biol. Chem. 260: 4724-4728. 

Stallings, W. C., Abdel-Meguid, S. S., Lim, L. W., Shieh, Huey-Sheng, 
Dayringer, H. E., Leimgruber, N. K., Stegeman, R. A., Anderson, K. S., 
Sikorski, J. A., Padgette, S. R., Kishore, G. M. (1991). Structure and 
Topological Symmetry of the Glyphosate Target 5-enol-pyruvylshikimate-3- 
phosphate synthase. Proc. Natl. Acad. Sci. USA 88, 5046-5050. 

Svab, Z., Hajdukiewicz, P., and Maliga, P. (1990) Stable transformation of 
plastids in higher plants. Proc. Natl. Acad. Sci. USA 87: 8526-8530. 

Svab, Z. and Maliga, P. (1993) High frequency plastid transformation in 
tobacco by selection for a chimeric aadA gene. Proc. Natl. Acad. Sci. USA 
90:913-917. 

Tabor, S. and Richardson, C. C. (1985) A bacteriophage T7 RNA 
polymerase/promoter system for controlled exclusive expression of specific 
genes. Proc. Natl. Acad. Sci. USA 82: 1074-1078. 






Talbot, H. W., Johnson, L. M. and Munnecke, D. M. (1984) Glyphosate 
utilization by Pseudomonas sp. and Alcaligenes sp. isolated from environmental 
sources. Current Microbiol. 10: 255-260. 

Talmadge, K., and Gilbert, W., (1980) "Construction of plasmid vectors with 
unique PstI cloning sites in the signal sequence coding region" 
Gene 12: 235-241. 

Timko, M.P., Herdies, L., de Almeida, E., Cashmore, A.R., Leemans, J., and 
Krebbers, E. 1988. Genetic Engineering of Nuclear-Encoded Components of 
the Photosynthetic Apparatus in Arabidopsis in "The Impact of Chemistry on 
Biotechnology," ACS Books, 279-295. 

Vasil, V., F. Redway and I. Vasil. (1990), Bio/Technology 8:429-434. 

Vieira, J. and Messing, J. (1987) Production of single-stranded plasmid DNA. 
Methods Enzymol. 153: 3-11. 

Yanisch-Perron, C., Vieira, J. and Messing, J. (1985). Improved M13 phage 
cloning vectors and host strains: nucleotide sequences of the M13mp18 and 
pUC19 vectors. Gene 33, 103-119. 



Annex A 



38-21(10535) 

PCT/US91/6148 (WO 92/04449) filed August 28, 1991 designated countries: 



Australia 

European Patent Office (Austria, Belgium, Denmark, France, Great Britain, 
Greece, Germany, Holland, Italy, Luxemburg, 
Sweden, Spain, Switzerland) 

Japan 

Russian Federation 






CLAIMS : 



1. An isolated DNA sequence other than the structural coding 
sequence listed in SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:66 and SEQ ID 
NO:68, encoding an EPSPS enzyme having the sequence domains: 
-R-X1-H-X2-E- (SEQ ID NO:37), in which 
X1 is G, S, T, C, Y, N, Q, D or E; 
X2 is S or T; and 
-G-D-K-X3- (SEQ ID NO:38), in which 
X3 is S or T; and 
-S-A-Q-X4-K- (SEQ ID NO:39), in which 
X4 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; and 
-N-X5-T-R- (SEQ ID NO:40), in which 
X5 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V. 
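The four sequence domains recited in Claim 1 can be read as short patterns over one-letter amino acid codes. The following is a minimal sketch, assuming a protein sequence is available as a plain string; the function names and the toy peptide are hypothetical and serve only as illustration. It checks for the presence of each domain and does not assess any other limitation of the claim.

```python
import re

# Any of the 20 residues listed for X4 and X5 in Claim 1.
ANY_RESIDUE = "ARNDCQEGHILKMFPSTWYV"

# One pattern per recited domain (one-letter amino acid codes).
DOMAIN_PATTERNS = {
    "SEQ ID NO:37": re.compile(r"R[GSTCYNQDE]H[ST]E"),        # -R-X1-H-X2-E-
    "SEQ ID NO:38": re.compile(r"GDK[ST]"),                   # -G-D-K-X3-
    "SEQ ID NO:39": re.compile(rf"SAQ[{ANY_RESIDUE}]K"),      # -S-A-Q-X4-K-
    "SEQ ID NO:40": re.compile(rf"N[{ANY_RESIDUE}]TR"),       # -N-X5-T-R-
}


def find_domains(protein):
    """Map each domain to the 0-based start of its first match, or None."""
    sequence = protein.upper()
    hits = {}
    for name, pattern in DOMAIN_PATTERNS.items():
        match = pattern.search(sequence)
        hits[name] = match.start() if match else None
    return hits


def has_all_domains(protein):
    """True if every one of the four domains occurs somewhere in the sequence."""
    return all(start is not None for start in find_domains(protein).values())


# Hypothetical toy peptide built to contain all four domains (not a real EPSPS).
TOY_PEPTIDE = "MRDHSEAAGDKSLLSAQVKPPNPTR"
print(find_domains(TOY_PEPTIDE))
```

As a usage note, residues 16 to 31 of the translation listed for SEQ ID NO:2 (`SGLSGTVRIPGDKSIS` in one-letter code) contain the -G-D-K-X3- domain with X3 = S, so `find_domains` reports a hit for SEQ ID NO:38 on that fragment and none for the other three domains.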



2. A DNA molecule of Claim 1 in which the Km 
for phosphoenolpyruvate is between 2 and 25 µM. 

3. A DNA molecule of Claim 1 in which the Ki/Km ratio is between 25 
and 500. 

4. A DNA molecule of Claim 1 in which X1 is D or N; X2 is S or T; X3 
is S or T; X4 is V, I or L; and X5 is P or Q. 

5. A DNA molecule of Claim 4 which encodes an EPSPS enzyme 
having the sequence of SEQ ID NO:3. 

6. A DNA molecule of Claim 5 having the sequence of SEQ ID NO:2. 

7. A DNA molecule of Claim 5 having the sequence of SEQ ID NO:9. 

8. A recombinant, double-stranded DNA molecule comprising in 
sequence: 

a) a promoter which functions in plant cells to cause the production 
of an RNA sequence; 

b) a structural DNA sequence that causes the production of an 
RNA sequence which encodes an EPSPS enzyme having the sequence 
domains: 

-R-X1-H-X2-E- (SEQ ID NO:37), in which 

X1 is G, S, T, C, Y, N, Q, D or E; 

X2 is S or T; and 
-G-D-K-X3- (SEQ ID NO:38), in which 

X3 is S or T; and 
-S-A-Q-X4-K- (SEQ ID NO:39), in which 

X4 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; and 
-N-X5-T-R- (SEQ ID NO:40), in which 

X5 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; 

and 

c) a 3' non-translated region which functions in plant cells to cause 
the addition of a stretch of polyadenyl nucleotides to the 3' end of the 
RNA sequence; 

where the promoter is heterologous with respect to the structural DNA 
sequence and adapted to cause sufficient expression of the encoded EPSPS 
enzyme to enhance the glyphosate tolerance of a plant cell transformed with 
the DNA molecule. 

9. A DNA molecule of Claim 8 in which the structural DNA 
sequence encodes a fusion polypeptide comprising an amino-terminal 
chloroplast transit peptide and the EPSPS enzyme. 

10. A DNA molecule of Claim 8 in which X1 is D or N; X2 is S or T; X3 
is S or T; X4 is V, I or L; and X5 is P or Q. 

11. A DNA molecule of Claim 10 in which the structural DNA 
sequence encodes an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID 
NO:43. 

12. A DNA molecule of Claim 9 in which X1 is D or N; X2 is S or T; X3 
is S or T; X4 is V, I or L; and X5 is P or Q. 

13. A DNA molecule of Claim 12 in which the structural DNA 
sequence encodes an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID 
NO:43. 

14. A DNA molecule of Claim 12 in which the EPSPS sequence is 
SEQ ID NO:3. 

15. A DNA molecule of Claim 14 in which the promoter is a plant 
DNA virus promoter. 

16. A DNA molecule of Claim 15 in which the promoter is selected 
from the group consisting of CaMV35S and FMV35S promoters. 

17. A DNA molecule of Claim 14 in which the structural DNA 
sequence encodes a chloroplast transit peptide selected from the group 
consisting of SEQ ID NO:11 and SEQ ID NO:15. 

18. A DNA molecule of Claim 17 in which the 3' non-translated region 
is selected from the group consisting of the NOS 3' and the E9 3' 
non-translated regions. 

19. A method of producing genetically transformed plants which are 
tolerant toward glyphosate herbicide, comprising the steps of: 

a) inserting into the genome of a plant cell a recombinant, 
double-stranded DNA molecule comprising: 

i) a promoter which functions in plant cells to cause the production 
of an RNA sequence, 

ii) a structural DNA sequence that causes the production of an 
RNA sequence which encodes an EPSPS enzyme having the sequence 
domains: 

-R-X1-H-X2-E- (SEQ ID NO:37), in which 

X1 is G, S, T, C, Y, N, Q, D or E; 

X2 is S or T; and 
-G-D-K-X3- (SEQ ID NO:38), in which 

X3 is S or T; and 
-S-A-Q-X4-K- (SEQ ID NO:39), in which 

X4 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; and 
-N-X5-T-R- (SEQ ID NO:40), in which 

X5 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; 

and 

iii) a 3' non-translated DNA sequence which functions in plant cells 
to cause the addition of a stretch of polyadenyl nucleotides to the 3' end 
of the RNA sequence 
where the promoter is heterologous with respect to the structural DNA 
sequence and adapted to cause sufficient expression of the polypeptide to 
enhance the glyphosate tolerance of a plant cell transformed with the DNA 
molecule; 

b) obtaining a transformed plant cell; and 

c) regenerating from the transformed plant cell a genetically 
transformed plant which has increased tolerance to glyphosate herbicide. 



20. A method of Claim 19 in which X1 is D or N; X2 is S or T; X3 is S or 
T; X4 is V, I or L; and X5 is P or Q. 

21. A method of Claim 20 in which the structural DNA sequence 
encodes an EPSPS enzyme selected from the group consisting of SEQ ID 
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID NO:43. 

22. A method of Claim 19 in which the structural DNA sequence 
encodes a fusion polypeptide comprising an amino-terminal chloroplast transit 
peptide and the EPSPS enzyme. 

23. A method of Claim 22 in which X1 is D or N; X2 is S or T; X3 is S or 
T; X4 is V, I or L; and X5 is P or Q. 

24. A method of Claim 23 in which the structural DNA sequence 
encodes an EPSPS enzyme selected from the group consisting of SEQ ID 
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:42 and SEQ ID NO:44. 

25. A method of Claim 23 in which the EPSPS enzyme is that set 
forth in SEQ ID NO:3. 

26. A method of Claim 25 in which the promoter is from a plant DNA 
virus. 

27. A method of Claim 26 in which the promoter is selected from the 
group consisting of CaMV35S and FMV35S promoters. 

28. A glyphosate-tolerant plant cell comprising a DNA molecule of 
Claims 9, 12 or 14. 



29. A glyphosate-tolerant plant cell of Claim 28 in which the 
promoter is a plant DNA virus promoter. 

30. A glyphosate-tolerant plant cell of Claim 29 in which the 
promoter is selected from the group consisting of CaMV35S and FMV35S 
promoters. 

31. A glyphosate-tolerant plant cell of Claim 28 selected from the 
group consisting of corn, wheat, rice, barley, soybean, cotton, sugarbeet, oilseed 
rape, canola, flax, sunflower, potato, tobacco, tomato, alfalfa, poplar, pine, 
eucalyptus, apple, lettuce, peas, lentils, grape and turf grasses. 

32. A glyphosate-tolerant plant comprising plant cells of Claim 31. 

33. A glyphosate-tolerant plant of Claim 32 in which the promoter is 
from a DNA plant virus promoter. 

34. A glyphosate-tolerant plant of Claim 33 in which the promoter is 
selected from the group consisting of CaMV35S and FMV35S promoters. 

35. A glyphosate-tolerant plant of Claim 34 selected from the group 
consisting of corn, wheat, rice, barley, soybean, cotton, sugarbeet, oilseed rape, 
canola, flax, sunflower, potato, tobacco, tomato, alfalfa, poplar, pine, 
eucalyptus, apple, lettuce, peas, lentils, grape and turf grasses. 

36. A method for selectively controlling weeds in a field containing a 
crop having planted crop seeds or plants comprising the steps of: 

a) planting the crop seeds or plants which are glyphosate-tolerant 
as a result of a recombinant double-stranded DNA molecule being inserted into 
the crop seed or plant, the DNA molecule having: 

i) a promoter which functions in plant cells to cause the production 
of an RNA sequence, 

ii) a structural DNA sequence that causes the production of an 
RNA sequence which encodes an EPSPS enzyme having the sequence 
domains: 

-R-X1-H-X2-E- (SEQ ID NO:37), in which 

X1 is G, S, T, C, Y, N, Q, D or E; 

X2 is S or T; and 
-G-D-K-X3- (SEQ ID NO:38), in which 

X3 is S or T; and 
-S-A-Q-X4-K- (SEQ ID NO:39), in which 

X4 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; and 
-N-X5-T-R- (SEQ ID NO:40), in which 

X5 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; 

and 

iii) a 3' non-translated DNA sequence which functions in plant cells 
to cause the addition of a stretch of polyadenyl nucleotides to the 3' end 
of the RNA sequence 
where the promoter is heterologous with respect to the structural DNA 
sequence and adapted to cause sufficient expression of the EPSPS enzyme to 
enhance the glyphosate tolerance of the crop plant transformed with the DNA 
molecule; and 

b) applying to the crop and weeds in the field a sufficient amount of 
glyphosate herbicide to control the weeds without significantly affecting the 
crop. 

37. A method of Claim 36 in which X1 is D or N; X2 is S or T; X3 is S or 
T; X4 is V, I or L; and X5 is P or Q. 

38. A method of Claim 37 in which the structural DNA sequence 
encodes an EPSPS enzyme selected from the sequences as set forth in SEQ ID 
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:42 and SEQ ID NO:44. 

39. A method of Claim 36 in which the structural DNA sequence 
encodes a fusion polypeptide comprising an amino-terminal chloroplast transit 
peptide and the EPSPS enzyme. 

40. A method of Claim 39 in which X1 is D or N; X2 is S or T; X3 is S or 
T; X4 is V, I or L; and X5 is P or Q. 

41. A method of Claim 40 in which the structural DNA sequence 
encodes an EPSPS enzyme selected from the sequences as set forth in SEQ ID 
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID NO:43. 

42. A method of Claim 40 in which the DNA molecule encodes an 
EPSPS enzyme as set forth in SEQ ID NO:3. 

43. A method of Claim 42 in which the DNA molecule further 
comprises a promoter selected from the group consisting of the CaMV35S and 
FMV35S promoters. 

44. A method of Claim 43 in which the crop plant is selected from the 
group consisting of corn, wheat, rice, barley, soybean, cotton, sugarbeet, oilseed 
rape, canola, flax, sunflower, potato, tobacco, tomato, alfalfa, poplar, pine, 
eucalyptus, apple, lettuce, peas, lentils, grape and turf grasses. 

45. A DNA molecule of Claim 9 in which the structural DNA 
sequence encodes a chloroplast transit peptide selected from the group 
consisting of SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15 and SEQ ID 
NO:17. 

46. A DNA molecule of Claim 45 in which the chloroplast transit 
peptide is encoded by a DNA sequence selected from the group consisting of 
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16. 

47. A DNA molecule of Claim 9 in which the structural DNA 
sequence encodes a chloroplast transit peptide selected from the group 
consisting of SEQ ID NO:11 and SEQ ID NO:15. 

48. A DNA molecule of Claim 47 in which the chloroplast transit 
peptide is encoded by a DNA sequence selected from the group consisting of 
SEQ ID NO:10 and SEQ ID NO:14. 

49. A DNA molecule of Claim 45 in which the promoter is selected 
from the group consisting of CaMV 35S and FMV 35S promoters. 

50. A DNA molecule of Claim 46 in which the promoter is selected 
from the group consisting of CaMV 35S and FMV 35S promoters. 

51. A DNA molecule of Claim 47 in which the promoter is selected 
from the group consisting of CaMV 35S and FMV 35S promoters. 

52. A DNA molecule of Claim 48 in which the promoter is selected 
from the group consisting of CaMV 35S and FMV 35S promoters. 



53. A DNA molecule of Claim 49 in which the 3' non-translated region 
is selected from the group consisting of the NOS 3' and the E9 3' 
non-translated regions. 



54. A DNA molecule of Claim 50 in which the 3' non-translated region 
is selected from the group consisting of the NOS 3' and the E9 3' non- 
translated regions. 

55. A DNA molecule of Claim 51 in which the 3' non-translated region 
is selected from the group consisting of the NOS 3' and the E9 3' non- 
translated regions. 



56. A DNA molecule of Claim 52 in which the 3' non-translated region 
is selected from the group consisting of the NOS 3' and the E9 3' 
non-translated regions. 

57. A DNA molecule of Claim 53 in which the structural DNA 
sequence encodes an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:42 and SEQ ID 
NO:44. 

58. A DNA molecule of Claim 54 in which the structural DNA 
sequence encodes an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:42 and SEQ ID 
NO:44. 



59. A DNA molecule of Claim 55 in which the structural DNA 
sequence encodes an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:42 and SEQ ID 
NO:44. 

60. A DNA molecule of Claim 56 in which the structural DNA 
sequence encodes an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:42 and SEQ ID 
NO:44. 

61. A DNA molecule of Claim 57 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and 
SEQ ID NO:43. 

62. A DNA molecule of Claim 58 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and 
SEQ ID NO:43. 

63. A DNA molecule of Claim 59 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and 
SEQ ID NO:43. 

64. A DNA molecule of Claim 60 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and 
SEQ ID NO:43. 

65. A DNA molecule of Claim 53 in which the structural DNA 
sequence encodes an EPSPS enzyme having the sequence of SEQ ID NO:3. 

66. A DNA molecule of Claim 54 in which the structural DNA 
sequence encodes an EPSPS enzyme having the sequence of SEQ ID NO:3. 

67. A DNA molecule of Claim 55 in which the structural DNA 
sequence encodes an EPSPS enzyme having the sequence of SEQ ID NO:3. 

68. A DNA molecule of Claim 56 in which the structural DNA 
sequence encodes an EPSPS enzyme having the sequence of SEQ ID NO:3. 

69. A DNA molecule of Claim 65 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2 and SEQ ID NO:9. 

70. A DNA molecule of Claim 66 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2 and SEQ ID NO:9. 

71. A DNA molecule of Claim 67 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2 and SEQ ID NO:9. 

72. A DNA molecule of Claim 68 in which the structural DNA 
sequence contains an EPSPS encoding sequence selected from the group 
consisting of SEQ ID NO:2 and SEQ ID NO:9. 

73. A glyphosate-tolerant plant cell of Claim 29 in which: 

(a) the promoter is selected from the group consisting of CaMV 35S 
and FMV 35S promoters; 

(b) the structural DNA sequence encodes: 

(i) a chloroplast transit peptide selected from the group 
consisting of SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15 and SEQ 
ID NO:17; and 

(ii) an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:42 and SEQ ID 
NO:44; and 

(c) the 3' non-translated region is selected from the group consisting 
of the NOS 3' and the E9 3' non-translated regions. 

75. A glyphosate-tolerant plant cell of Claim 73 in which the 
structural DNA sequence comprises: 

(a) a chloroplast transit peptide encoding DNA sequence selected 
from the group consisting of SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 
and SEQ ID NO:16; and 

(b) an EPSPS encoding sequence selected from the group consisting 
of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID 
NO:43. 

76. A glyphosate-tolerant plant cell of Claim 73 in which the 
structural DNA sequence comprises: 

(a) a chloroplast transit peptide encoding DNA sequence selected 
from the group consisting of SEQ ID NO:10 and SEQ ID NO:14; and 

(b) a DNA sequence encoding an EPSPS enzyme having the 
sequence of SEQ ID NO:3. 

77. A glyphosate-tolerant plant cell of Claim 74 in which the 
structural DNA sequence comprises an EPSPS encoding sequence selected 
from the group consisting of SEQ ID NO:2 and SEQ ID NO:9. 

78. A glyphosate-tolerant plant cell of Claim 75 selected from the 
group consisting of corn, wheat, rice, barley, soybean, cotton, sugarbeet, oilseed 
rape, canola, flax, sunflower, potato, tobacco, tomato, alfalfa, poplar, pine, 
eucalyptus, apple, lettuce, peas, lentils, grape and turf grasses. 

79. A glyphosate-tolerant plant comprising a DNA molecule of 
Claims 9, 12 or 14 in which: 

(a) the promoter is selected from the group consisting of CaMV 35S 
and FMV 35S promoters; 

(b) the structural DNA sequence encodes: 

(i) a chloroplast transit peptide selected from the group 
consisting of SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15 and SEQ 
ID NO:17; and 

(ii) an EPSPS enzyme selected from the group consisting of 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:42 and SEQ ID 
NO:44; and 

(c) the 3' non-translated region is selected from the group consisting 
of the NOS 3' and the E9 3' non-translated regions. 

80. A glyphosate-tolerant plant of Claim 79 in which the structural 
DNA sequence comprises: 

(a) a chloroplast transit peptide encoding DNA sequence selected 
from the group consisting of SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 
and SEQ ID NO:16; and 

(b) an EPSPS encoding sequence selected from the group consisting 
of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID 
NO:43. 

81. A glyphosate-tolerant plant of Claim 80 in which the structural 
DNA sequence comprises: 

(a) a chloroplast transit peptide encoding DNA sequence selected 
from the group consisting of SEQ ID NO:10 and SEQ ID NO:14; and 

(b) a DNA sequence encoding an EPSPS enzyme having the 
sequence of SEQ ID NO:3. 

82. A glyphosate-tolerant plant of Claim 81 in which the structural 
DNA sequence comprises an EPSPS encoding sequence selected from the 
group consisting of SEQ ID NO:2 and SEQ ID NO:9. 

83. A glyphosate-tolerant plant of Claim 82 selected from the group 
consisting of corn, wheat, rice, barley, soybean, cotton, sugarbeet, oilseed rape, 
canola, flax, sunflower, potato, tobacco, tomato, alfalfa, poplar, pine, 
eucalyptus, apple, lettuce, peas, lentils, grape and turf grasses. 

84. A seed of a glyphosate-tolerant plant of Claim 32. 

85. A seed of a glyphosate-tolerant plant of Claim 35. 

86. A seed of a glyphosate-tolerant plant of Claim 79. 

87. A seed of a glyphosate-tolerant plant of Claim 80. 

88. A seed of a glyphosate-tolerant plant of Claim 81. 

89. A seed of a glyphosate-tolerant plant of Claim 82. 

90. A seed of a glyphosate-tolerant plant of Claim 83. 

91. A transgenic soybean plant which contains a heterologous gene 
which encodes an EPSPS enzyme having a Km for phosphoenolpyruvate (PEP) 
between 1 and 150 µM and a Ki(glyphosate)/Km(PEP) ratio between about 2 
and 500, said plant exhibiting tolerance to N-phosphonomethylglycine 
herbicide at a rate of 1 lb/acre without significant yield reduction due to 
herbicide application. 
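The kinetic limits recited in Claim 91 amount to two range checks: one on the Km for PEP and one on the ratio Ki(glyphosate)/Km(PEP). The sketch below is illustrative only. The function name and the example values are hypothetical, both constants are assumed to be expressed in µM, and the word "between" is treated as inclusive, which is an assumption rather than a construction of the claim.

```python
def meets_kinetic_limits(km_pep_um, ki_glyphosate_um,
                         km_range=(1.0, 150.0), ratio_range=(2.0, 500.0)):
    """Check a Km(PEP) and a Ki(glyphosate)/Km(PEP) ratio against two ranges.

    Defaults follow the figures in Claim 91; bounds are treated as
    inclusive (an assumption). Both inputs are in micromolar.
    """
    if km_pep_um <= 0:
        raise ValueError("Km must be positive")
    ratio = ki_glyphosate_um / km_pep_um
    km_ok = km_range[0] <= km_pep_um <= km_range[1]
    ratio_ok = ratio_range[0] <= ratio <= ratio_range[1]
    return km_ok and ratio_ok


# Hypothetical values, not measurements from the specification.
print(meets_kinetic_limits(km_pep_um=10.0, ki_glyphosate_um=1000.0))
```

The narrower figures of Claims 2 and 3 can be tried by passing `km_range=(2.0, 25.0)` and `ratio_range=(25.0, 500.0)`.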

92. Seed of a soybean plant of Claim 91. 



93. In a method for the transformation and regeneration of 
transgenic plants, the improvement which comprises the use of a 
glyphosate-resistance marker gene comprising: 

i) a promoter which functions in plant cells to cause the production 
of an RNA sequence, 

ii) a structural DNA sequence that causes the production of an 
RNA sequence which encodes an EPSPS enzyme having the sequence 
domains: 

-R-X1-H-X2-E- (SEQ ID NO:37), in which 
X1 is G, S, T, C, Y, N, Q, D or E; 
X2 is S or T; and 
-G-D-K-X3- (SEQ ID NO:38), in which 
X3 is S or T; and 
-S-A-Q-X4-K- (SEQ ID NO:39), in which 
X4 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; and 
-N-X5-T-R- (SEQ ID NO:40), in which 
X5 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y or V; 

and 

iii) a 3' non-translated DNA sequence which functions in plant cells 
to cause the addition of a stretch of polyadenyl nucleotides to the 3' end 
of the RNA sequence; 
where the promoter is heterologous with respect to the structural DNA 
sequence and adapted to cause sufficient expression of the polypeptide to 
render a plant cell transformed with the DNA molecule tolerant to a toxic 
level of glyphosate. 

94. A method of Claim 93 in which X1 is D or N; X2 is S or T; X3 is S or 
T; X4 is V, I or L; and X5 is P or Q. 

95. A method of Claim 94 in which the structural DNA sequence 
encodes an EPSPS enzyme selected from the group consisting of SEQ ID 
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:41 and SEQ ID NO:43. 

96. A method of Claim 93 in which the structural DNA sequence 
encodes a fusion polypeptide comprising an amino-terminal chloroplast transit 
peptide and the EPSPS enzyme. 



ABSTRACT OF THE DISCLOSURE 

Genes encoding Class II EPSPS enzymes are disclosed. The genes are 
useful in producing transformed bacteria and plants which are tolerant to 
glyphosate herbicide. Class II EPSPS genes share little homology with known, 
Class I EPSPS genes, and do not hybridize to probes from Class I EPSPS's. 
The Class II EPSPS enzymes are characterized by being more kinetically 
efficient than Class I EPSPS's in the presence of glyphosate. Plants 
transformed with Class II EPSPS genes are also disclosed as well as a method 
for selectively controlling weeds in a planted transgenic crop field. 

APPLICATION FOR UNITED STATES PATENT 
DECLARATION * POWER OF ATTORNEY * PETITION 

AS A BELOW-NAMED INVENTOR, I hereby declare that: 
MY RESIDENCE, citizenship, and post office address are as stated below, 
next to my name. 

I BELIEVE I am: 

1. [ ] the original, first and sole inventor, 

2. [X] an original, first and joint inventor, 

of the subject matter which is claimed and for which 
a patent is sought on the invention entitled 

3. GLYPHOSATE-TOLERANT 5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASES 
the specification of which, with any Preliminary Amendment, 

4. [X] is attached hereto 

5. [ ] was filed on 
5(a). as application Serial No. 

6. [ ] including Amendment(s) filed on (date) 

and (date) 

7. [ ] together with any Amendment(s) filed herewith. 

I HEREBY STATE that I have reviewed and understand the contents of the 
above-identified Specification, including the Claims, as amended by any Amendment(s) 
referred to above. 

I ACKNOWLEDGE my Duty to Disclose information of which I am aware 
which is material to the Examination of this Application in accordance with Title 37, Code 
of Federal Regulations, §1.56(a) including any such information which occurred between 
the filing date of any prior application listed below for which the benefit of Title 35, United 
States Code §120 is claimed and the filing date of this Application. 

I HEREBY STATE that the subject matter which is claimed in any 
Amendment(s) referred to above was part of my or our invention and was invented before 
the filing of this Application. 

BENEFIT OF EARLIER FILING DATE 

THIS APPLICATION in whole or in part discloses and claims subject 
matter disclosed in and I hereby claim the benefit under Title 35, United States Code, 
§120 of any of my or our prior United States application(s) listed below: 

    SERIAL NO.      FILING DATE     STATUS 

8.  07/749,611      08-28-91        Pending 

    07/573,537      08-31-90        Abandoned 

I HEREBY CLAIM foreign priority benefits under Title 35, United States 
Code, §119 of any foreign application(s) for patent or Inventor's Certificate(s) listed 
below: 

    NUMBER          COUNTRY         FILING DATE 

9. 



Any foreign application(s) for patent or Inventor's Certificate(s) filed by me 
or us which claims or discloses all or any part of the subject matter claimed in this 
Application and which has a filing date before that of the above-listed application(s) on 
which foreign priority is claimed is identified below: 

    NUMBER          COUNTRY         FILING DATE 

10. 

AS TO ANY subject matter which is claimed in this Application which is not 
common to any above-identified prior application(s) for which the benefit of 35 USC §119 
or §120 is claimed, I do not know and do not believe that the same was ever known or 
used in the United States before my or our invention or discovery thereof, or patented or 
described in any printed publication in any country before my or our invention thereof or 
more than one year prior to the date of this Application, or in public use or on sale in the 
United States more than one year prior to the date of this Application, that said subject 
matter has not been patented or made the subject of an Inventor's Certificate issued 
before the date of this Application in any country foreign to the United States on an 
application filed by me or my legal representatives or assigns more than twelve months 
prior to this Application. 

AS TO ANY subject matter which is claimed in this Application which is 
common to any above-identified prior application(s) for which the benefit of 35 USC §120 
is claimed, I do not know and do not believe that the same was ever known or used in the United 
States before my or our invention or discovery thereof, or patented or described in any 
printed publication in any country before my or our invention thereof or more than one 
year prior to the earliest of said prior application(s) to which said subject matter is 
common, or in public use or on sale in the United States more than one year prior to the 
earliest of said prior application(s) to which said subject matter is common, that said 
subject matter has not been patented or made the subject of an Inventor's Certificate 
issued before the date of the earliest of said prior application(s) to which said subject 
matter is common in any country foreign to the United States on an application filed by 
me or my legal representatives or assigns more than twelve months prior to the earliest 
of said prior application(s) to which said subject matter is common. 

11. [X] ALL APPLICATION(S), if any, for patent or Inventor's Certificate on 
any part of said subject claimed in this Application filed by me or my representatives or 
assigns in any country foreign to the United States of America in addition to any listed 
above on which priority is claimed are listed in Annex A, attached hereto. 

I HEREBY appoint the following as my attorney(s) and/or agent(s) of record 
with full power of substitution and revocation to prosecute this Application and to 
transact all business in the Patent and Trademark Office connected therewith. 



12. Dennis R. Hoerner, Jr.      Reg. No. 30,914 
    Richard H. Shear            Reg. No. 26,583 
    James C. Bolding            Reg. No. 26,843 
    Grace L. Bonner             Reg. No. 32,963 
    Lawrence M. Lavin, Jr.      Reg. No. 30,768 



ALL correspondence/telephone calls in connection with this Application 
should be directed to: 



13. Dennis R. Hoerner, Jr. - BB4F 

MONSANTO COMPANY 

700 Chesterfield Parkway North 

St. Louis, Missouri 63198 
13(a). Telephone Number: (314) 537-6099 

I FURTHER declare that all statements made herein of my own knowledge 
are true and that all statements made on information and belief are believed to be true; 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, under 
Section 1001 of Title 18 of the United States Code, and that such willful false statements 
may jeopardize the validity of the Application or any patent issuing thereon. 
WHEREFORE, I PRAY that Letters Patent be granted to me solely or 
jointly with the additional inventor(s) (if any) named below for the invention described and 
claimed in the above-identified specification and claims, and I hereby subscribe my name 
to the above-identified specification and claims, Declaration, Power of Attorney and this 
Petition. 

14(a). SOLE or FIRST JOINT INVENTOR, 
full name: Gerard Francis Barry 
RESIDENCE (State/Country): Missouri/United States 
CITIZENSHIP: Republic of Ireland 
POST OFFICE ADDRESS: 6350 Waterman Avenue 
St. Louis, Missouri 63130 
INVENTOR'S SIGNATURE: 
DATE: 

14(b). SECOND JOINT INVENTOR (if any), 
full name: Ganesh M. Kishore 
RESIDENCE (State/Country): [illegible] 
CITIZENSHIP: United States 
POST OFFICE ADDRESS: 15354 Grantley [illegible] 
Chesterfield, Missouri [illegible] 
INVENTOR'S SIGNATURE: 
DATE: 

14(c). THIRD JOINT INVENTOR (if any), 
full name: Stephen Rogers Padgette 
RESIDENCE (State/Country): Missouri/United States 
CITIZENSHIP: United States 
POST OFFICE ADDRESS: 553 Nantucket [illegible] 
Grover, Missouri 63040 
INVENTOR'S SIGNATURE: 
DATE: 

14(d). FOURTH JOINT INVENTOR (if any), 
full name: William Carlton Stallings 
RESIDENCE (State/Country): Missouri/United States 
CITIZENSHIP: United States 
POST OFFICE ADDRESS: 19165 Old [illegible] Road 
INVENTOR'S SIGNATURE: 
DATE: 



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Barry, Gerard F. 

Kishore, Ganesh M. 
Padgette, Stephen R. 
Stallings, William C. 

(ii) TITLE OF INVENTION: Glyphosate Tolerant 

5-Enolpyruvylshikimate-3-Phosphate Synthases 

(iii) NUMBER OF SEQUENCES: 69 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Dennis R. Hoerner, Jr., Monsanto Co. BB4F 

(B) STREET: 700 Chesterfield Village Parkway 

(C) CITY: St. Louis 

(D) STATE: Missouri 

(E) COUNTRY: USA 

(F) ZIP: 63198 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/749,611 
(B) FILING DATE: 28-AUG-1991 
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/576,537 
(B) FILING DATE: 31-AUG-1990 
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Hoerner Jr., Dennis R. 
(B) REGISTRATION NUMBER: 30,914 
(C) REFERENCE/DOCKET NUMBER: 38-21(10660)A 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (314)537-6099 
(B) TELEFAX: (314)537-[illegible] 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 597 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

TCATCAAAAT ATTTAGCAGC ATTCCAGATT GGGTTCAATC AACAAGGTAC GAGCCATATC 60 

ACTTTATTCA AATTGGTATC GCCAAAACCA AGAAGGAACT CCCATCCTCA AAGGTTTGTA 120 

AGGAAGAATT CTCAGTCCAA AGCCTCAACA AGGTCAGGGT ACAGAGTCTC CAAACCATTA 180 

GCCAAAAGCT ACAGGAGATC AATGAAGAAT CTTCAATCAA AGTAAACTAC TGTTCCAGCA 240

CATGCATCAT GGTCAGTAAG TTTCAGAAAA AGACATCCAC CGAAGACTTA AAGTTAGTGG 300 

GCATCTTTGA AAGTAATCTT GTCAACATCG AGCAGCTGGC TTGTGGGGAC CAGACAAAAA 360 

AGGAATGGTG CAGAATTGTT AGGCGCACCT ACCAAAAGCA TCTTTGCCTT TATTGCAAAG 420 

ATAAAGCAGA TTCCTCTAGT ACAAGTGGGG AACAAAATAA CGTGGAAAAG AGCTGTCCTG 480 

ACAGCCCACT CACTAATGCG TATGACGAAC GCAGTGACGA CCACAAAAGA ATTCCCTCTA 540 

TATAAGAAGG CATTCATTCC CATTTGAAGG ATCATCAGAT ACTAACCAAT ATTTCTC 597 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1982 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
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(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 62..1426



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAGCCCGCGT TCTCTCCGGC GCTCCGCCCG GAGAGCCGTG GATAGATTAA GGAAGACGCC 

C ATG TCG CAC GGT GCA AGC AGC CGG CCC GCA ACC GCC CGC AAA TCC 
Met Ser His Gly Ala Ser Ser Arg Pro Ala Thr Ala Arg Lys Ser
1 5 10 15

TCT GGC CTT TCC GGA ACC GTC CGC ATT CCC GGC GAC AAG TCG ATC TCC
Ser Gly Leu Ser Gly Thr Val Arg Ile Pro Gly Asp Lys Ser Ile Ser
20 25 30

CAC CGG TCC TTC ATG TTC GGC GGT CTC GCG AGC GGT GAA ACG CGC ATC
His Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile
35 40 45

ACC GGC CTT CTG GAA GGC GAG GAC GTC ATC AAT ACG GGC AAG GCC ATG
Thr Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Lys Ala Met
50 55 60

CAG GCC ATG GGC GCC AGG ATC CGT AAG GAA GGC GAC ACC TGG ATC ATC
Gln Ala Met Gly Ala Arg Ile Arg Lys Glu Gly Asp Thr Trp Ile Ile
65 70 75

GAT GGC GTC GGC AAT GGC GGC CTC CTG GCG CCT GAG GCG CCG CTC GAT
Asp Gly Val Gly Asn Gly Gly Leu Leu Ala Pro Glu Ala Pro Leu Asp
80 85 90 95

TTC GGC AAT GCC GCC ACG GGC TGC CGC CTG ACC ATG GGC CTC GTC GGG
Phe Gly Asn Ala Ala Thr Gly Cys Arg Leu Thr Met Gly Leu Val Gly
100 105 110

GTC TAC GAT TTC GAC AGC ACC TTC ATC GGC GAC GCC TCG CTC ACA AAG
Val Tyr Asp Phe Asp Ser Thr Phe Ile Gly Asp Ala Ser Leu Thr Lys
115 120 125

CGC CCG ATG GGC CGC GTG TTG AAC CCG CTG CGC GAA ATG GGC GTG CAG
Arg Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln
130 135 140
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GTG AAA TCG GAA GAC GGT GAC CGT CTT CCC GTT ACC TTG CGC GGG CCG 538 
Val Lys Ser Glu Asp Gly Asp Arg Leu Pro Val Thr Leu Arg Gly Pro 
145 150 155 

AAG ACG CCG ACG CCG ATC ACC TAC CGC GTG CCG ATG GCC TCC GCA CAG 586 
Lys Thr Pro Thr Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln
160 165 170 175

GTG AAG TCC GCC GTG CTG CTC GCC GGC CTC AAC ACG CCC GGC ATC ACG 634
Val Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly Ile Thr
180 185 190

ACG GTC ATC GAG CCG ATC ATG ACG CGC GAT CAT ACG GAA AAG ATG CTG 682
Thr Val Ile Glu Pro Ile Met Thr Arg Asp His Thr Glu Lys Met Leu
195 200 205

CAG GGC TTT GGC GCC AAC CTT ACC GTC GAG ACG GAT GCG GAC GGC GTG 730
Gln Gly Phe Gly Ala Asn Leu Thr Val Glu Thr Asp Ala Asp Gly Val
210 215 220

CGC ACC ATC CGC CTG GAA GGC CGC GGC AAG CTC ACC GGC CAA GTC ATC 778
Arg Thr Ile Arg Leu Glu Gly Arg Gly Lys Leu Thr Gly Gln Val Ile
225 230 235

GAC GTG CCG GGC GAC CCG TCC TCG ACG GCC TTC CCG CTG GTT GCG GCC 826
Asp Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala
240 245 250 255

CTG CTT GTT CCG GGC TCC GAC GTC ACC ATC CTC AAC GTG CTG ATG AAC 874
Leu Leu Val Pro Gly Ser Asp Val Thr Ile Leu Asn Val Leu Met Asn
260 265 270

CCC ACC CGC ACC GGC CTC ATC CTG ACG CTG CAG GAA ATG GGC GCC GAC 922
Pro Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp
275 280 285

ATC GAA GTC ATC AAC CCG CGC CTT GCC GGC GGC GAA GAC GTG GCG GAC 970
Ile Glu Val Ile Asn Pro Arg Leu Ala Gly Gly Glu Asp Val Ala Asp
290 295 300

CTG CGC GTT CGC TCC TCC ACG CTG AAG GGC GTC ACG GTG CCG GAA GAC 1018
Leu Arg Val Arg Ser Ser Thr Leu Lys Gly Val Thr Val Pro Glu Asp
305 310 315

CGC GCG CCT TCG ATG ATC GAC GAA TAT CCG ATT CTC GCT GTC GCC GCC 1066
Arg Ala Pro Ser Met Ile Asp Glu Tyr Pro Ile Leu Ala Val Ala Ala
320 325 330 335
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GCC TTC GCG GAA GGG GCG ACC GTG ATG AAC GGT CTG GAA GAA CTC CGC 1114 
Ala Phe Ala Glu Gly Ala Thr Val Met Asn Gly Leu Glu Glu Leu Arg 
340 345 350 

GTC AAG GAA AGC GAC CGC CTC TCG GCC GTC GCC AAT GGC CTC AAG CTC 1162 
Val Lys Glu Ser Asp Arg Leu Ser Ala Val Ala Asn Gly Leu Lys Leu 
355 360 365 

AAT GGC GTG GAT TGC GAT GAG GGC GAG ACG TCG CTC GTC GTG CGC GGC 1210
Asn Gly Val Asp Cys Asp Glu Gly Glu Thr Ser Leu Val Val Arg Gly 
370 375 380 

CGC CCT GAC GGC AAG GGG CTC GGC AAC GCC TCG GGC GCC GCC GTC GCC 1258 
Arg Pro Asp Gly Lys Gly Leu Gly Asn Ala Ser Gly Ala Ala Val Ala 
385 390 395 

ACC CAT CTC GAT CAC CGC ATC GCC ATG AGC TTC CTC GTC ATG GGC CTC 1306 
Thr His Leu Asp His Arg Ile Ala Met Ser Phe Leu Val Met Gly Leu
400 405 410 415

GTG TCG GAA AAC CCT GTC ACG GTG GAC GAT GCC ACG ATG ATC GCC ACG 1354
Val Ser Glu Asn Pro Val Thr Val Asp Asp Ala Thr Met Ile Ala Thr
420 425 430

AGC TTC CCG GAG TTC ATG GAC CTG ATG GCC GGG CTG GGC GCG AAG ATC 1402
Ser Phe Pro Glu Phe Met Asp Leu Met Ala Gly Leu Gly Ala Lys Ile
435 440 445

GAA CTC TCC GAT ACG AAG GCT GCC TGATGACCTT CACAATCGCC ATCGATGGTC 1456
Glu Leu Ser Asp Thr Lys Ala Ala
450 455 

CCGCTGCGGC CGGCAAGGGG ACGCTCTCGC GCCGTATCGC GGAGGTCTAT GGCTTTCATC 1516 

ATCTCGATAC GGGCCTGACC TATCGCGCCA CGGCCAAAGC GCTGCTCGAT CGCGGCCTGT 1576 

CGCTTGATGA CGAGGCGGTT GCGGCCGATG TCGCCCGCAA TCTCGATCTT GCCGGGCTCG 1636 

ACCGGTCGGT GCTGTCGGCC CATGCCATCG GCGAGGCGGC TTCGAAGATC GCGGTCATGC 1696

CCTCGGTGCG GCGGGCGCTG GTCGAGGCGC AGCGCAGCTT TGCGGCGCGT GAGCCGGGCA 1756

CGGTGCTGGA TGGACGCGAT ATCGGCACGG TGGTCTGCCC GGATGCGCCG GTGAAGCTCT 1816

ATGTCACCGC GTCACCGGAA GTGCGCGCGA AACGCCGCTA TGACGAAATC CTCGGCAATG 1876

GCGGGTTGGC CGATTACGGG ACGATCCTCG AGGATATCCG CCGCCGCGAC GAGCGGGACA 1936
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TGGGTCGGGC GGACAGTCCT TTGAAGCCCG CCGACGATGC GCACTT 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 455 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Ser His Gly Ala Ser Ser Arg Pro Ala Thr Ala Arg Lys Ser Ser 
1 5 10 15

Gly Leu Ser Gly Thr Val Arg Ile Pro Gly Asp Lys Ser Ile Ser His
20 25 30

Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile Thr
35 40 45

Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Lys Ala Met Gln
50 55 60

Ala Met Gly Ala Arg Ile Arg Lys Glu Gly Asp Thr Trp Ile Ile Asp
65 70 75 80

Gly Val Gly Asn Gly Gly Leu Leu Ala Pro Glu Ala Pro Leu Asp Phe
85 90 95

Gly Asn Ala Ala Thr Gly Cys Arg Leu Thr Met Gly Leu Val Gly Val
100 105 110

Tyr Asp Phe Asp Ser Thr Phe Ile Gly Asp Ala Ser Leu Thr Lys Arg
115 120 125

Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln Val
130 135 140

Lys Ser Glu Asp Gly Asp Arg Leu Pro Val Thr Leu Arg Gly Pro Lys
145 150 155 160

Thr Pro Thr Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln Val
165 170 175
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Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly lie Thr Thr 
180 185 190 

Val Ile Glu Pro Ile Met Thr Arg Asp His Thr Glu Lys Met Leu Gln
195 200 205

Gly Phe Gly Ala Asn Leu Thr Val Glu Thr Asp Ala Asp Gly Val Arg
210 215 220

Thr Ile Arg Leu Glu Gly Arg Gly Lys Leu Thr Gly Gln Val Ile Asp
225 230 235 240

Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala Leu
245 250 255

Leu Val Pro Gly Ser Asp Val Thr Ile Leu Asn Val Leu Met Asn Pro
260 265 270

Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp Ile
275 280 285

Glu Val Ile Asn Pro Arg Leu Ala Gly Gly Glu Asp Val Ala Asp Leu
290 295 300

Arg Val Arg Ser Ser Thr Leu Lys Gly Val Thr Val Pro Glu Asp Arg
305 310 315 320

Ala Pro Ser Met Ile Asp Glu Tyr Pro Ile Leu Ala Val Ala Ala Ala
325 330 335

Phe Ala Glu Gly Ala Thr Val Met Asn Gly Leu Glu Glu Leu Arg Val
340 345 350

Lys Glu Ser Asp Arg Leu Ser Ala Val Ala Asn Gly Leu Lys Leu Asn
355 360 365

Gly Val Asp Cys Asp Glu Gly Glu Thr Ser Leu Val Val Arg Gly Arg 
370 375 380 

Pro Asp Gly Lys Gly Leu Gly Asn Ala Ser Gly Ala Ala Val Ala Thr
385 390 395 400

His Leu Asp His Arg Ile Ala Met Ser Phe Leu Val Met Gly Leu Val
405 410 415

Ser Glu Asn Pro Val Thr Val Asp Asp Ala Thr Met Ile Ala Thr Ser
420 425 430
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Phe Pro Glu Phe Met Asp Leu Met Ala Gly Leu Gly Ala Lys He Glu 
435 440 445 

Leu Ser Asp Thr Lys Ala Ala 
450 455 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1673 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 86..1432



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GTAGCCACAC ATAATTACTA TAGCTAGGAA GCCCGCTATC TCTCAATCCC GCGTGATCGC 60 

GCCAAAATGT GACTGTGAAA AATCC ATG TCC CAT TCT GCA TCC CCG AAA CCA 112 
Met Ser His Ser Ala Ser Pro Lys Pro 

1 5 

GCA ACC GCC CGC CGC TCG GAG GCA CTC ACG GGC GAA ATC CGC ATT CCG 160
Ala Thr Ala Arg Arg Ser Glu Ala Leu Thr Gly Glu Ile Arg Ile Pro
10 15 20 25

GGC GAC AAG TCC ATC TCG CAT CGC TCC TTC ATG TTT GGC GGT CTC GCA 208
Gly Asp Lys Ser Ile Ser His Arg Ser Phe Met Phe Gly Gly Leu Ala
30 35 40

TCG GGC GAA ACC CGC ATC ACC GGC CTT CTG GAA GGC GAG GAC GTC ATC 256
Ser Gly Glu Thr Arg Ile Thr Gly Leu Leu Glu Gly Glu Asp Val Ile
45 50 55

AAT ACA GGC CGC GCC ATG CAG GCC ATG GGC GCG AAA ATC CGT AAA GAG 304
Asn Thr Gly Arg Ala Met Gln Ala Met Gly Ala Lys Ile Arg Lys Glu
60 65 70

GGC GAT GTC TGG ATC ATC AAC GGC GTC GGC AAT GGC TGC CTG TTG CAG 352
Gly Asp Val Trp Ile Ile Asn Gly Val Gly Asn Gly Cys Leu Leu Gln
75 80 85
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CCC GAA GCT GCG CTC GAT TTC GGC AAT GCC GGA ACC GGC GCG CGC CTC 
Pro Glu Ala Ala Leu Asp Phe Gly Asn Ala Gly Thr Gly Ala Arg Leu 
90 95 100 105 

ACC ATG GGC CTT GTC GGC ACC TAT GAC ATG AAG ACC TCC TTT ATC GGC 
Thr Met Gly Leu Val Gly Thr Tyr Asp Met Lys Thr Ser Phe Ile Gly
110 115 120

GAC GCC TCG CTG TCG AAG CGC CCG ATG GGC CGC GTG CTG AAC CCG TTG
Asp Ala Ser Leu Ser Lys Arg Pro Met Gly Arg Val Leu Asn Pro Leu
125 130 135

CGC GAA ATG GGC GTT CAG GTG GAA GCA GCC GAT GGC GAC CGC ATG CCG
Arg Glu Met Gly Val Gln Val Glu Ala Ala Asp Gly Asp Arg Met Pro
140 145 150

CTG ACG CTG ATC GGC CCG AAG ACG GCC AAT CCG ATC ACC TAT CGC GTG
Leu Thr Leu Ile Gly Pro Lys Thr Ala Asn Pro Ile Thr Tyr Arg Val
155 160 165

CCG ATG GCC TCC GCG CAG GTA AAA TCC GCC GTG CTG CTC GCC GGT CTC
Pro Met Ala Ser Ala Gln Val Lys Ser Ala Val Leu Leu Ala Gly Leu
170 175 180 185

AAC ACG CCG GGC GTC ACC ACC GTC ATC GAG CCG GTC ATG ACC CGC GAC
Asn Thr Pro Gly Val Thr Thr Val Ile Glu Pro Val Met Thr Arg Asp
190 195 200

CAC ACC GAA AAG ATG CTG CAG GGC TTT GGC GCC GAC CTC ACG GTC GAG
His Thr Glu Lys Met Leu Gln Gly Phe Gly Ala Asp Leu Thr Val Glu
205 210 215

ACC GAC AAG GAT GGC GTG CGC CAT ATC CGC ATC ACC GGC CAG GGC AAG
Thr Asp Lys Asp Gly Val Arg His Ile Arg Ile Thr Gly Gln Gly Lys
220 225 230

CTT GTC GGC CAG ACC ATC GAC GTG CCG GGC GAT CCG TCA TCG ACC GCC
Leu Val Gly Gln Thr Ile Asp Val Pro Gly Asp Pro Ser Ser Thr Ala
235 240 245

TTC CCG CTC GTT GCC GCC CTT CTG GTG GAA GGT TCC GAC GTC ACC ATC
Phe Pro Leu Val Ala Ala Leu Leu Val Glu Gly Ser Asp Val Thr Ile
250 255 260 265

CGC AAC GTG CTG ATG AAC CCG ACC CGT ACC GGC CTC ATC CTC ACC TTG
Arg Asn Val Leu Met Asn Pro Thr Arg Thr Gly Leu Ile Leu Thr Leu
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270 275 280 

CAG GAA ATG GGC GCC GAT ATC GAA GTG CTC AAT GCC CGT CTT GCA GGC 976 
Gln Glu Met Gly Ala Asp Ile Glu Val Leu Asn Ala Arg Leu Ala Gly
285 290 295 

GGC GAA GAC GTC GCC GAT CTG CGC GTC AGG GCT TCG AAG CTC AAG GGC 1024 
Gly Glu Asp Val Ala Asp Leu Arg Val Arg Ala Ser Lys Leu Lys Gly 
300 305 310 

GTC GTC GTT CCG CCG GAA CGT GCG CCG TCG ATG ATC GAC GAA TAT CCG 1072 
Val Val Val Pro Pro Glu Arg Ala Pro Ser Met Ile Asp Glu Tyr Pro
315 320 325 

GTC CTG GCG ATT GCC GCC TCC TTC GCG GAA GGC GAA ACC GTG ATG GAC 1120 
Val Leu Ala Ile Ala Ala Ser Phe Ala Glu Gly Glu Thr Val Met Asp
330 335 340 345 

GGG CTC GAC GAA CTG CGC GTC AAG GAA TCG GAT CGT CTG GCA GCG GTC 1168 
Gly Leu Asp Glu Leu Arg Val Lys Glu Ser Asp Arg Leu Ala Ala Val 
350 355 360 

GCA CGC GGC CTT GAA GCC AAC GGC GTC GAT TGC ACC GAA GGC GAG ATG 1216 
Ala Arg Gly Leu Glu Ala Asn Gly Val Asp Cys Thr Glu Gly Glu Met 
365 370 375 

TCG CTG ACG GTT CGC GGC CGC CCC GAC GGC AAG GGA CTG GGC GGC GGC 1264 
Ser Leu Thr Val Arg Gly Arg Pro Asp Gly Lys Gly Leu Gly Gly Gly 
380 385 390 

ACG GTT GCA ACC CAT CTC GAT CAT CGT ATC GCG ATG AGC TTC CTC GTG 1312 
Thr Val Ala Thr His Leu Asp His Arg Ile Ala Met Ser Phe Leu Val
395 400 405 

ATG GGC CTT GCG GCG GAA AAG CCG GTG ACG GTT GAC GAC AGT AAC ATG 1360 
Met Gly Leu Ala Ala Glu Lys Pro Val Thr Val Asp Asp Ser Asn Met 
410 415 420 425 

ATC GCC ACG TCC TTC CCC GAA TTC ATG GAC ATG ATG CCG GGA TTG GGC 1408 
Ile Ala Thr Ser Phe Pro Glu Phe Met Asp Met Met Pro Gly Leu Gly
430 435 440 

GCA AAG ATC GAG TTG AGC ATA CTC TAGTCACTCG ACAGCGAAAA TATTATTTGC 1462 
Ala Lys Ile Glu Leu Ser Ile Leu
445 



GAGATTGGGC ATTATTACCG GTTGGTCTCA GCGGGGGTTT AATGTCCAAT CTTCCATACG 1522 
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TAACAGCATC AGGAAATATC AAAAAAGCTT TAGAAGGAAT TGCTAGAGCA GCGACGCCGC 1582 
CTAAGCTTTC TCAAGACTTC GTTAAAACTG TACTGAAATC CCGGGGGGTC CGGGGATCAA 1642 
ATGACTTCAT TTCTGAGAAA TTGGCCTCGC A 1673

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 449 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Ser His Ser Ala Ser Pro Lys Pro Ala Thr Ala Arg Arg Ser Glu 
1 5 10 15

Ala Leu Thr Gly Glu Ile Arg Ile Pro Gly Asp Lys Ser Ile Ser His
20 25 30

Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile Thr
35 40 45

Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Arg Ala Met Gln
50 55 60

Ala Met Gly Ala Lys Ile Arg Lys Glu Gly Asp Val Trp Ile Ile Asn
65 70 75 80

Gly Val Gly Asn Gly Cys Leu Leu Gln Pro Glu Ala Ala Leu Asp Phe
85 90 95

Gly Asn Ala Gly Thr Gly Ala Arg Leu Thr Met Gly Leu Val Gly Thr
100 105 110

Tyr Asp Met Lys Thr Ser Phe Ile Gly Asp Ala Ser Leu Ser Lys Arg
115 120 125

Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln Val
130 135 140
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Glu Ala Ala Asp Gly Asp Arg Met Pro Leu Thr Leu He Gly Pro Lys 
145 150 155 160 

Thr Ala Asn Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln Val
165 170 175

Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly Val Thr Thr
180 185 190

Val Ile Glu Pro Val Met Thr Arg Asp His Thr Glu Lys Met Leu Gln
195 200 205

Gly Phe Gly Ala Asp Leu Thr Val Glu Thr Asp Lys Asp Gly Val Arg
210 215 220

His Ile Arg Ile Thr Gly Gln Gly Lys Leu Val Gly Gln Thr Ile Asp
225 230 235 240

Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala Leu
245 250 255

Leu Val Glu Gly Ser Asp Val Thr Ile Arg Asn Val Leu Met Asn Pro
260 265 270

Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp Ile
275 280 285

Glu Val Leu Asn Ala Arg Leu Ala Gly Gly Glu Asp Val Ala Asp Leu
290 295 300

Arg Val Arg Ala Ser Lys Leu Lys Gly Val Val Val Pro Pro Glu Arg
305 310 315 320

Ala Pro Ser Met Ile Asp Glu Tyr Pro Val Leu Ala Ile Ala Ala Ser
325 330 335

Phe Ala Glu Gly Glu Thr Val Met Asp Gly Leu Asp Glu Leu Arg Val
340 345 350



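Lys Glu Ser Asp Arg Leu Ala Ala Val Ala Arg Gly Leu Glu Ala Asn
355 360 365

Gly Val Asp Cys Thr Glu Gly Glu Met Ser Leu Thr Val Arg Gly Arg
370 375 380

Pro Asp Gly Lys Gly Leu Gly Gly Gly Thr Val Ala Thr His Leu Asp
385 390 395 400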
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His Arg He Ala Met Ser Phe Leu Val Met Gly Leu Ala Ala Glu Lys 
405 410 415 

Pro Val Thr Val Asp Asp Ser Asn Met Ile Ala Thr Ser Phe Pro Glu
420 425 430

Phe Met Asp Met Met Pro Gly Leu Gly Ala Lys Ile Glu Leu Ser Ile
435 440 445

Leu



(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 34..1380



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GTGATCGCGC CAAAATGTGA CTGTGAAAAA TCC ATG TCC CAT TCT GCA TCC CCG 

Met Ser His Ser Ala Ser Pro 
1 5 

AAA CCA GCA ACC GCC CGC CGC TCG GAG GCA CTC ACG GGC GAA ATC CGC 
Lys Pro Ala Thr Ala Arg Arg Ser Glu Ala Leu Thr Gly Glu Ile Arg
10 15 20

ATT CCG GGC GAC AAG TCC ATC TCG CAT CGC TCC TTC ATG TTT GGC GGT
Ile Pro Gly Asp Lys Ser Ile Ser His Arg Ser Phe Met Phe Gly Gly
25 30 35

CTC GCA TCG GGC GAA ACC CGC ATC ACC GGC CTT CTG GAA GGC GAG GAC
Leu Ala Ser Gly Glu Thr Arg Ile Thr Gly Leu Leu Glu Gly Glu Asp
40 45 50 55
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GTC ATC AAT ACA GGC CGC GCC ATG CAG GCC ATG GGC GCG AAA ATC CGT 
Val Ile Asn Thr Gly Arg Ala Met Gln Ala Met Gly Ala Lys Ile Arg
60 65 70

AAA GAG GGC GAT GTC TGG ATC ATC AAC GGC GTC GGC AAT GGC TGC CTG
Lys Glu Gly Asp Val Trp Ile Ile Asn Gly Val Gly Asn Gly Cys Leu
75 80 85

TTG CAG CCC GAA GCT GCG CTC GAT TTC GGC AAT GCC GGA ACC GGC GCG
Leu Gln Pro Glu Ala Ala Leu Asp Phe Gly Asn Ala Gly Thr Gly Ala
90 95 100

CGC CTC ACC ATG GGC CTT GTC GGC ACC TAT GAC ATG AAG ACC TCC TTT
Arg Leu Thr Met Gly Leu Val Gly Thr Tyr Asp Met Lys Thr Ser Phe
105 110 115

ATC GGC GAC GCC TCG CTG TCG AAG CGC CCG ATG GGC CGC GTG CTG AAC
Ile Gly Asp Ala Ser Leu Ser Lys Arg Pro Met Gly Arg Val Leu Asn
120 125 130 135

CCG TTG CGC GAA ATG GGC GTT CAG GTG GAA GCA GCC GAT GGC GAC CGC
Pro Leu Arg Glu Met Gly Val Gln Val Glu Ala Ala Asp Gly Asp Arg
140 145 150

ATG CCG CTG ACG CTG ATC GGC CCG AAG ACG GCC AAT CCG ATC ACC TAT
Met Pro Leu Thr Leu Ile Gly Pro Lys Thr Ala Asn Pro Ile Thr Tyr
155 160 165

CGC GTG CCG ATG GCC TCC GCG CAG GTA AAA TCC GCC GTG CTG CTC GCC
Arg Val Pro Met Ala Ser Ala Gln Val Lys Ser Ala Val Leu Leu Ala
170 175 180

GGT CTC AAC ACG CCG GGC GTC ACC ACC GTC ATC GAG CCG GTC ATG ACC
Gly Leu Asn Thr Pro Gly Val Thr Thr Val Ile Glu Pro Val Met Thr
185 190 195

CGC GAC CAC ACC GAA AAG ATG CTG CAG GGC TTT GGC GCC GAC CTC ACG
Arg Asp His Thr Glu Lys Met Leu Gln Gly Phe Gly Ala Asp Leu Thr
200 205 210 215

GTC GAG ACC GAC AAG GAT GGC GTG CGC CAT ATC CGC ATC ACC GGC CAG
Val Glu Thr Asp Lys Asp Gly Val Arg His Ile Arg Ile Thr Gly Gln
220 225 230

GGC AAG CTT GTC GGC CAG ACC ATC GAC GTG CCG GGC GAT CCG TCA TCG
Gly Lys Leu Val Gly Gln Thr Ile Asp Val Pro Gly Asp Pro Ser Ser
235 240 245
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ACC GCC TTC CCG CTC GTT GCC GCC CTT CTG GTG GAA GGT TCC GAC GTC 
Thr Ala Phe Pro Leu Val Ala Ala Leu Leu Val Glu Gly Ser Asp Val 
250 255 260 

ACC ATC CGC AAC GTG CTG ATG AAC CCG ACC CGT ACC GGC CTC ATC CTC 
Thr Ile Arg Asn Val Leu Met Asn Pro Thr Arg Thr Gly Leu Ile Leu
265 270 275

ACC TTG CAG GAA ATG GGC GCC GAT ATC GAA GTG CTC AAT GCC CGT CTT
Thr Leu Gln Glu Met Gly Ala Asp Ile Glu Val Leu Asn Ala Arg Leu
280 285 290 295

GCA GGC GGC GAA GAC GTC GCC GAT CTG CGC GTC AGG GCT TCG AAG CTC
Ala Gly Gly Glu Asp Val Ala Asp Leu Arg Val Arg Ala Ser Lys Leu
300 305 310

AAG GGC GTC GTC GTT CCG CCG GAA CGT GCG CCG TCG ATG ATC GAC GAA
Lys Gly Val Val Val Pro Pro Glu Arg Ala Pro Ser Met Ile Asp Glu
315 320 325

TAT CCG GTC CTG GCG ATT GCC GCC TCC TTC GCG GAA GGC GAA ACC GTG
Tyr Pro Val Leu Ala Ile Ala Ala Ser Phe Ala Glu Gly Glu Thr Val
330 335 340

ATG GAC GGG CTC GAC GAA CTG CGC GTC AAG GAA TCG GAT CGT CTG GCA
Met Asp Gly Leu Asp Glu Leu Arg Val Lys Glu Ser Asp Arg Leu Ala
345 350 355

GCG GTC GCA CGC GGC CTT GAA GCC AAC GGC GTC GAT TGC ACC GAA GGC
Ala Val Ala Arg Gly Leu Glu Ala Asn Gly Val Asp Cys Thr Glu Gly
360 365 370 375

GAG ATG TCG CTG ACG GTT CGC GGC CGC CCC GAC GGC AAG GGA CTG GGC
Glu Met Ser Leu Thr Val Arg Gly Arg Pro Asp Gly Lys Gly Leu Gly
380 385 390

GGC GGC ACG GTT GCA ACC CAT CTC GAT CAT CGT ATC GCG ATG AGC TTC
Gly Gly Thr Val Ala Thr His Leu Asp His Arg Ile Ala Met Ser Phe
395 400 405

CTC GTG ATG GGC CTT GCG GCG GAA AAG CCG GTG ACG GTT GAC GAC AGT
Leu Val Met Gly Leu Ala Ala Glu Lys Pro Val Thr Val Asp Asp Ser
410 415 420

AAC ATG ATC GCC ACG TCC TTC CCC GAA TTC ATG GAC ATG ATG CCG GGA
Asn Met Ile Ala Thr Ser Phe Pro Glu Phe Met Asp Met Met Pro Gly
425 430 435
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TTG GGC GCA AAG ATC GAG TTG AGC ATA CTC TAGTCACTCG ACAGCGAAAA 1400 
Leu Gly Ala Lys Ile Glu Leu Ser Ile Leu
440 445 

TATTATTTGC GAGATTGGGC ATTATTACCG GTTGGTCTCA GCGGGGGTTT AATGTCCAAT 1460 
CTTCCATACG TAACAGCATC AGGAAATATC AAAAAAGCTT 1500 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 449 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Ser His Ser Ala Ser Pro Lys Pro Ala Thr Ala Arg Arg Ser Glu 
1 5 10 15

Ala Leu Thr Gly Glu Ile Arg Ile Pro Gly Asp Lys Ser Ile Ser His
20 25 30

Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile Thr
35 40 45

Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Arg Ala Met Gln
50 55 60

Ala Met Gly Ala Lys Ile Arg Lys Glu Gly Asp Val Trp Ile Ile Asn
65 70 75 80

Gly Val Gly Asn Gly Cys Leu Leu Gln Pro Glu Ala Ala Leu Asp Phe
85 90 95

Gly Asn Ala Gly Thr Gly Ala Arg Leu Thr Met Gly Leu Val Gly Thr
100 105 110

Tyr Asp Met Lys Thr Ser Phe Ile Gly Asp Ala Ser Leu Ser Lys Arg
115 120 125

Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln Val
130 135 140






Glu Ala Ala Asp Gly Asp Arg Met Pro Leu Thr Leu Ile Gly Pro Lys
145 150 155 160

Thr Ala Asn Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln Val
165 170 175

Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly Val Thr Thr
180 185 190

Val Ile Glu Pro Val Met Thr Arg Asp His Thr Glu Lys Met Leu Gln
195 200 205

Gly Phe Gly Ala Asp Leu Thr Val Glu Thr Asp Lys Asp Gly Val Arg
210 215 220

His Ile Arg Ile Thr Gly Gln Gly Lys Leu Val Gly Gln Thr Ile Asp
225 230 235 240

Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala Leu
245 250 255

Leu Val Glu Gly Ser Asp Val Thr Ile Arg Asn Val Leu Met Asn Pro
260 265 270

Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp Ile
275 280 285

Glu Val Leu Asn Ala Arg Leu Ala Gly Gly Glu Asp Val Ala Asp Leu
290 295 300

Arg Val Arg Ala Ser Lys Leu Lys Gly Val Val Val Pro Pro Glu Arg
305 310 315 320

Ala Pro Ser Met Ile Asp Glu Tyr Pro Val Leu Ala Ile Ala Ala Ser
325 330 335

Phe Ala Glu Gly Glu Thr Val Met Asp Gly Leu Asp Glu Leu Arg Val
340 345 350

Lys Glu Ser Asp Arg Leu Ala Ala Val Ala Arg Gly Leu Glu Ala Asn
355 360 365

Gly Val Asp Cys Thr Glu Gly Glu Met Ser Leu Thr Val Arg Gly Arg
370 375 380

Pro Asp Gly Lys Gly Leu Gly Gly Gly Thr Val Ala Thr His Leu Asp
385 390 395 400
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His Arg He Ala Met Ser Phe Leu Val Met Gly Leu Ala Ala Glu Lys 
405 410 415 

Pro Val Thr Val Asp Asp Ser Asn Met Ile Ala Thr Ser Phe Pro Glu
420 425 430

Phe Met Asp Met Met Pro Gly Leu Gly Ala Lys Ile Glu Leu Ser Ile
435 440 445 

Leu 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 423 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Ser Leu Thr Leu Gln Pro Ile Ala Arg Val Asp Gly Thr Ile Asn Leu
1 5 10 15

Pro Gly Ser Lys Thr Val Ser Asn Arg Ala Leu Leu Leu Ala Ala Leu
20 25 30

Ala His Gly Lys Thr Val Leu Thr Asn Leu Leu Asp Ser Asp Asp Val
35 40 45

Arg His Met Leu Asn Ala Leu Thr Ala Leu Gly Val Ser Tyr Thr Leu
50 55 60

Ser Ala Asp Arg Thr Arg Cys Glu Ile Ile Gly Asn Gly Gly Pro Leu
65 70 75 80

His Ala Glu Gly Ala Leu Glu Leu Phe Leu Gly Asn Ala Gly Thr Ala
85 90 95

Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Ser Asn Asp Ile Val
100 105 110
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Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro He Gly His Leu Val 
115 120 125 

Asp Ala Leu Arg Leu Gly Gly Ala Lys Ile Thr Tyr Leu Glu Gln Glu
130 135 140

Asn Tyr Pro Pro Leu Arg Leu Gln Gly Gly Phe Thr Gly Gly Asn Val
145 150 155 160

Asp Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu Leu Met
165 170 175

Thr Ala Pro Leu Ala Pro Glu Asp Thr Val Ile Arg Ile Lys Gly Asp
180 185 190

Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu Asn Leu Met Lys Thr
195 200 205

Phe Gly Val Glu Ile Glu Asn Gln His Tyr Gln Gln Phe Val Val Lys
210 215 220

Gly Gly Gln Ser Tyr Gln Ser Pro Gly Thr Tyr Leu Val Glu Gly Asp
225 230 235 240

Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Ala Ala Ile Lys Gly Gly
245 250 255

Thr Val Lys Val Thr Gly Ile Gly Arg Asn Ser Met Gln Gly Asp Ile
260 265 270

Arg Phe Ala Asp Val Leu Glu Lys Met Gly Ala Thr Ile Cys Trp Gly
275 280 285

Asp Asp Tyr Ile Ser Cys Thr Arg Gly Glu Leu Asn Ala Ile Asp Met
290 295 300

Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr Ala Ala
305 310 315 320
305 310 315 320 

Leu Phe Ala Lys Gly Thr Thr Arg Leu Arg Asn Ile Tyr Asn Trp Arg
325 330 335

Val Lys Glu Thr Asp Arg Leu Phe Ala Met Thr Glu Leu Arg Lys
340 345 350

Ile Arg Ile Thr Pro
365
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Pro Glu Lys Leu Asn Phe Ala Glu He Ala Thr Tyr Asn Asp His Arg 
370 375 380 

Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro Val Thr 
385 390 395 400 

Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr Phe Glu
405 410 415

Gln Leu Ala Arg Ile Ser Gln
420 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1377 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CCATGGCTCA CGGTGCAAGC AGCCGTCCAG CAACTGCTCG TAAGTCCTCT GGTCTTTCTG 60 

GAACCGTCCG TATTCCAGGT GACAAGTCTA TCTCCCACAG GTCCTTCATG TTTGGAGGTC 120

TCGCTAGCGG TGAAACTCGT ATCACCGGTC TTTTGGAAGG TGAAGATGTT ATCAACACTG 180 

GTAAGGCTAT GCAAGCTATG GGTGCCAGAA TCCGTAAGGA AGGTGATACT TGGATCATTG 240

ATGGTGTTGG TAACGGTGGA CTCCTTGCTC CTGAGGCTCC TCTCGATTTC GGTAACGCTG 300 

CAACTGGTTG CCGTTTGACT ATGGGTCTTG TTGGTGTTTA CGATTTCGAT AGCACTTTCA 360

TTGGTGACGC TTCTCTCACT AAGCGTCCAA TGGGTCGTGT GTTGAACCCA CTTCGCGAAA 420 

TGGGTGTGCA GGTGAAGTCT GAAGACGGTG ATCGTCTTCC AGTTACCTTG CGTGGACCAA 480 

AGACTCCAAC GCCAATCACC TACAGGGTAC CTATGGCTTC CGCTCAAGTG AAGTCCGCTG 540

TTCTGCTTGC TGGTCTCAAC ACCCCAGGTA TCACCACTGT TATCGAGCCA ATCATGACTC 600 

GTGACCACAC TGAAAAGATG CTTCAAGGTT TTGGTGCTAA CCTTACCGTT GAGACTGATG 660
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CTGACGGTGT GCGTACCATC CGTCTTGAAG GTCGTGGTAA GCTCACCGGT CAAGTGATTG 720 

ATGTTCCAGG TGATCCATCC TCTACTGCTT TCCCATTGGT TGCTGCCTTG CTTGTTCCAG 780 

GTTCCGACGT CACCATCCTT AACGTTTTGA TGAACCCAAC CCGTACTGGT CTCATCTTGA 840 

CTCTGCAGGA AATGGGTGCC GACATCGAAG TGATCAACCC ACGTCTTGCT GGTGGAGAAG 900 

ACGTGGCTGA CTTGCGTGTT CGTTCTTCTA CTTTGAAGGG TGTTACTGTT CCAGAAGACC 960 

GTGCTCCTTC TATGATCGAC GAGTATCCAA TTCTCGCTGT TGCAGCTGCA TTCGCTGAAG 1020 

GTGCTACCGT TATGAACGGT TTGGAAGAAC TCCGTGTTAA GGAAAGCGAC CGTCTTTCTG 1080 

CTGTCGCAAA CGGTCTCAAG CTCAACGGTG TTGATTGCGA TGAAGGTGAG ACTTCTCTCG 1140 

TCGTGCGTGG TCGTCCTGAC GGTAAGGGTC TCGGTAACGC TTCTGGAGCA GCTGTCGCTA 1200 

CCCACCTCGA TCACCGTATC GCTATGAGCT TCCTCGTTAT GGGTCTCGTT TCTGAAAACC 1260 

CTGTTACTGT TGATGATGCT ACTATGATCG CTACTAGCTT CCCAGAGTTC ATGGATTTGA 1320 

TGGCTGGTCT TGGAGCTAAG ATCGAACTCT CCGACACTAA GGCTGCTTGA TGAGCTC 1377 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 318 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 87..317

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGATCTATCG ATAAGCTTGA TGTAATTGGA GGAAGATCAA AATTTTCAAT CCCCATTCTT 

CGATTGCTTC AATTGAAGTT TCTCCG ATG GCG CAA GTT AGC AGA ATC TGC AAT 
Met Ala Gln Val Ser Arg Ile Cys Asn
1 5
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GGT GTG CAG AAC CCA TCT CTT ATC TCC AAT CTC TCG AAA TCC AGT CAA 
Gly Val Gln Asn Pro Ser Leu Ile Ser Asn Leu Ser Lys Ser Ser Gln
10 15 20 25

CGC AAA TCT CCC TTA TCG GTT TCT CTG AAG ACG CAG CAG CAT CCA CGA
Arg Lys Ser Pro Leu Ser Val Ser Leu Lys Thr Gln Gln His Pro Arg
30 35 40

GCT TAT CCG ATT TCG TCG TCG TGG GGA TTG AAG AAG AGT GGG ATG ACG
Ala Tyr Pro Ile Ser Ser Ser Trp Gly Leu Lys Lys Ser Gly Met Thr
45 50 55

TTA ATT GGC TCT GAG CTT CGT CCT CTT AAG GTC ATG TCT TCT GTT TCC
Leu Ile Gly Ser Glu Leu Arg Pro Leu Lys Val Met Ser Ser Val Ser
60 65 70 

ACG GCG TGC ATG C 
Thr Ala Cys Met 
75 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Ala Gln Val Ser Arg Ile Cys Asn Gly Val Gln Asn Pro Ser Leu
1 5 10 15

Ile Ser Asn Leu Ser Lys Ser Ser Gln Arg Lys Ser Pro Leu Ser Val
20 25 30

Ser Leu Lys Thr Gln Gln His Pro Arg Ala Tyr Pro Ile Ser Ser Ser
35 40 45

Trp Gly Leu Lys Lys Ser Gly Met Thr Leu Ile Gly Ser Glu Leu Arg
50 55 60

Pro Leu Lys Val Met Ser Ser Val Ser Thr Ala Cys Met
65 70 75
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(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 402 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 87..401



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

AGATCTATCG ATAAGCTTGA TGTAATTGGA GGAAGATCAA AATTTTCAAT CCCCATTCTT 

CGATTGCTTC AATTGAAGTT TCTCCG ATG GCG CAA GTT AGC AGA ATC TGC AAT 
Met Ala Gln Val Ser Arg Ile Cys Asn
1 5

GGT GTG CAG AAC CCA TCT CTT ATC TCC AAT CTC TCG AAA TCC AGT CAA
Gly Val Gln Asn Pro Ser Leu Ile Ser Asn Leu Ser Lys Ser Ser Gln
10 15 20 25

CGC AAA TCT CCC TTA TCG GTT TCT CTG AAG ACG CAG CAG CAT CCA CGA
Arg Lys Ser Pro Leu Ser Val Ser Leu Lys Thr Gln Gln His Pro Arg
30 35 40

GCT TAT CCG ATT TCG TCG TCG TGG GGA TTG AAG AAG AGT GGG ATG ACG
Ala Tyr Pro Ile Ser Ser Ser Trp Gly Leu Lys Lys Ser Gly Met Thr
45 50 55

TTA ATT GGC TCT GAG CTT CGT CCT CTT AAG GTC ATG TCT TCT GTT TCC
Leu Ile Gly Ser Glu Leu Arg Pro Leu Lys Val Met Ser Ser Val Ser
60 65 70

ACG GCG GAG AAA GCG TCG GAG ATT GTA CTT CAA CCC ATT AGA GAA ATC
Thr Ala Glu Lys Ala Ser Glu Ile Val Leu Gln Pro Ile Arg Glu Ile
75 80 85

TCC GGT CTT ATT AAG TTG CCT GGC TCC AAG TCT CTA TCA AAT AGA ATT
Ser Gly Leu Ile Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile
90 95 100 105
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(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ala Gln Val Ser Arg Ile Cys Asn Gly Val Gln Asn Pro Ser Leu
1 5 10 15

Ile Ser Asn Leu Ser Lys Ser Ser Gln Arg Lys Ser Pro Leu Ser Val
20 25 30

Ser Leu Lys Thr Gln Gln His Pro Arg Ala Tyr Pro Ile Ser Ser Ser
35 40 45

Trp Gly Leu Lys Lys Ser Gly Met Thr Leu Ile Gly Ser Glu Leu Arg
50 55 60

Pro Leu Lys Val Met Ser Ser Val Ser Thr Ala Glu Lys Ala Ser Glu
65 70 75 80

Ile Val Leu Gln Pro Ile Arg Glu Ile Ser Gly Leu Ile Lys Leu Pro
85 90 95

Gly Ser Lys Ser Leu Ser Asn Arg Ile
100 105

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 14..232
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

AGATCTTTCA AGA ATG GCA CAA ATT AAC AAC ATG GCT CAA GGG ATA CAA 
Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln
1 5 10

ACC CTT AAT CCC AAT TCC AAT TTC CAT AAA CCC CAA GTT CCT AAA TCT
Thr Leu Asn Pro Asn Ser Asn Phe His Lys Pro Gln Val Pro Lys Ser
15 20 25

TCA AGT TTT CTT GTT TTT GGA TCT AAA AAA CTG AAA AAT TCA GCA AAT
Ser Ser Phe Leu Val Phe Gly Ser Lys Lys Leu Lys Asn Ser Ala Asn
30 35 40

TCT ATG TTG GTT TTG AAA AAA GAT TCA ATT TTT ATG CAA AAG TTT TGT
Ser Met Leu Val Leu Lys Lys Asp Ser Ile Phe Met Gln Lys Phe Cys
45 50 55 60

TCC TTT AGG ATT TCA GCA TCA GTG GCT ACA GCC TGC ATG C
Ser Phe Arg Ile Ser Ala Ser Val Ala Thr Ala Cys Met
65 70 
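
The CDS feature of SEQ ID NO: 14 (LOCATION: 14..232) and the residue line printed under each codon row can be cross-checked mechanically. The following is a minimal Python sketch and is not part of the filed listing. The function name `translate_cds` is hypothetical, and the sketch assumes the standard genetic code and the 1-based, inclusive coordinates used in the LOCATION field.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(dna, start, end):
    """Translate positions start..end (1-based, inclusive) of a DNA string.

    Whitespace in `dna` is ignored, so rows can be pasted as printed.
    Returns one-letter amino acid codes, with '*' for a stop codon.
    """
    seq = "".join(dna.split()).upper()
    cds = seq[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


if __name__ == "__main__":
    # First printed row of SEQ ID NO: 14: 13 leader bases, then 12 codons.
    first_row = "AGATCTTTCA AGA ATG GCA CAA ATT AAC AAC ATG GCT CAA GGG ATA CAA"
    print(translate_cds(first_row, 14, 49))  # MAQINNMAQGIQ
```

Applied to a whole sequence with its stated LOCATION, the same call gives a string that can be compared against the corresponding protein entry (here SEQ ID NO: 15) after converting three-letter codes to one-letter codes.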



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln Thr Leu Asn Pro
1 5 10 15

Asn Ser Asn Phe His Lys Pro Gln Val Pro Lys Ser Ser Ser Phe Leu
20 25 30

Val Phe Gly Ser Lys Lys Leu Lys Asn Ser Ala Asn Ser Met Leu Val
35 40 45

Leu Lys Lys Asp Ser Ile Phe Met Gln Lys Phe Cys Ser Phe Arg Ile

Ser Ala Ser Val Ala Thr Ala Cys Met
65 70 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 49..351



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

AGATCTGCTA GAAATAATTT TGTTTAACTT TAAGAAGGAG ATATATCC ATG GCA CAA 

Met Ala Gln

ATT AAC AAC ATG GCT CAA GGG ATA CAA ACC CTT AAT CCC AAT TCC AAT
Ile Asn Asn Met Ala Gln Gly Ile Gln Thr Leu Asn Pro Asn Ser Asn
5 10 15

TTC CAT AAA CCC CAA GTT CCT AAA TCT TCA AGT TTT CTT GTT TTT GGA
Phe His Lys Pro Gln Val Pro Lys Ser Ser Ser Phe Leu Val Phe Gly
20 25 30 35

TCT AAA AAA CTG AAA AAT TCA GCA AAT TCT ATG TTG GTT TTG AAA AAA
Ser Lys Lys Leu Lys Asn Ser Ala Asn Ser Met Leu Val Leu Lys Lys
40 45 50

GAT TCA ATT TTT ATG CAA AAG TTT TGT TCC TTT AGG ATT TCA GCA TCA
Asp Ser Ile Phe Met Gln Lys Phe Cys Ser Phe Arg Ile Ser Ala Ser
55 60 65

GTG GCT ACA GCA CAG AAG CCT TCT GAG ATA GTG TTG CAA CCC ATT AAA
Val Ala Thr Ala Gln Lys Pro Ser Glu Ile Val Leu Gln Pro Ile Lys

GAG ATT TCA GGC ACT GTT AAA TTG CCT GGC TCT AAA TCA TTA TCT AAT
Glu Ile Ser Gly Thr Val Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn

AGA ATT C
Arg Ile
100 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 101 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln Thr Leu Asn Pro
1 5 10 15

Asn Ser Asn Phe His Lys Pro Gln Val Pro Lys Ser Ser Ser Phe Leu
20 25 30

Val Phe Gly Ser Lys Lys Leu Lys Asn Ser Ala Asn Ser Met Leu Val
35 40 45

Leu Lys Lys Asp Ser Ile Phe Met Gln Lys Phe Cys Ser Phe Arg Ile
50 55 60

Ser Ala Ser Val Ala Thr Ala Gln Lys Pro Ser Glu Ile Val Leu Gln
65 70 75 80

Pro Ile Lys Glu Ile Ser Gly Thr Val Lys Leu Pro Gly Ser Lys Ser
85 90 95

Leu Ser Asn Arg Ile
100 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

-a His Gly Ala Ser Ser Arg Pro Ala Thr Ala Arg Lys ^ ^ ^
10 15

Leu Xaa Gly Thr Val Arg Ile Pro Gly Asp Lys ^
25



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Ala Pro Ser Met Ile Asp Glu Tyr Pro Ile Leu Ala
5 10 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Ile Thr Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Lys

10 15

(2) INFORMATION FOR SEQ ID NO: 21:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Other nucleic acid

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
ATGATHGAYG ARTAYCC 17 
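
Several synthetic oligonucleotides in this listing (SEQ ID NOs: 21, 22, 23, and 31) contain IUPAC ambiguity codes such as H, Y, R, N, B, and S, so each entry stands for a pool of concrete sequences. The following Python sketch is not part of the filed listing. It assumes the standard IUPAC nucleotide code table, and the helper names `degeneracy` and `expand` are hypothetical. It counts and enumerates the members of such a pool.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(oligo):
    """Drop the spacing used in the printed listing and upper-case."""
    return "".join(oligo.split()).upper()


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in _clean(oligo):
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """List every concrete sequence represented by a degenerate oligo."""
    pools = [IUPAC[base] for base in _clean(oligo)]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    seq21 = "ATGATHGAYG ARTAYCC"  # SEQ ID NO: 21 as printed
    print(degeneracy(seq21))      # 24
    print(expand(seq21)[0])
```

Read in codons, SEQ ID NO: 21 (ATG ATH GAY GAR TAY CC) appears to cover the codons for the residues Met Ile Asp Glu Tyr Pro found within SEQ ID NO: 19. That reading is an observation from the listing itself, not a statement made in it.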
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GARGAYGTNA THAACAC

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid

(A) DESCRIPTION: Synthetic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GARGAYGTNA THAATAC

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

CGTGGATAGA TCTAGGAAGA CAACCATGGC TCACGGTC 38 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GGATAGATTA AGGAAGACGC GCATGCTTCA CGGTGCAAGC AGCC 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GGCTGCCTGA TGAGCTCCAC AATCGCCATC GATGG 35

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CGTCGCTCGT CGTGCGTGGC CGCCCTGACG GC 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CGGGCAAGGC CATGCAGGCT ATGGGCGCC 29
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CGGGCTGCCG CCTGACTATG GGCCTCGTCG G 31

(2) INFORMATION FOR SEQ ID NO: 30:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Xaa His Ser Ala Ser Pro Lys Pro Ala Thr Ala Arg Arg Ser Glu
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GCGGTBGCSG GYTTSGG 
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Pro Gly Asp Lys Ser Ile Ser His Arg Ser Phe Met Phe Gly Gly Leu
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Leu Asp Phe Gly Asn Ala Ala Thr Gly Cys Arg Leu Thr 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CGGCAATGCC GCCACCGGCG CGCGCC 

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GGACGGCTGC TTGCACCGTG AAGCATGCTT AAGCTTGGCG TAATCATGG 49 
(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GGAAGACGCC CAGAATTCAC GGTGCAAGCA GCCGG 35 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site
(B) LOCATION: 2

(D) OTHER INFORMATION: /note= "Xaa at position 2 is Gly,
Ser, Thr, Cys, Tyr, Asn, Gln, Asp, or Glu"

(ix) FEATURE:

(A) NAME/KEY: Modified-site
(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "Xaa at position 4 is Ser
or Thr"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



Arg Xaa His Xaa Glu 
1 5 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "Xaa at position 4 is Ser
or Thr" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 
Gly Asp Lys Xaa 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site
(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "Xaa at position 4 is Ala,
Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu,
Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, or Val"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Ser Ala Gln Xaa Lys
1 5 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide



(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 2 

(D) OTHER INFORMATION: /note= "Xaa at position 2 is Ala,
Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu,
Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, or Val"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Asn Xaa Thr Arg 
1 
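
SEQ ID NOs: 37 to 40 are short motifs in which each Xaa position is limited to the residues listed in its /note field. The following Python sketch is not part of the filed listing. The names `MOTIFS`, `to_one_letter`, and `find_motifs` are hypothetical, and writing each motif as a regular expression over one-letter codes is only one possible encoding. It scans a three-letter-code protein sequence for these motifs.

```python
import re

# Three-letter to one-letter amino acid codes (20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # Xaa restricted to the 20 listed residues

# Motifs transcribed from the /note fields of SEQ ID NOs: 37 to 40.
MOTIFS = {
    "SEQ ID NO: 37": "R[GSTCYNQDE]H[ST]E",  # Arg Xaa His Xaa Glu
    "SEQ ID NO: 38": "GDK[ST]",             # Gly Asp Lys Xaa
    "SEQ ID NO: 39": "SAQ" + ANY + "K",     # Ser Ala Gln Xaa Lys
    "SEQ ID NO: 40": "N" + ANY + "TR",      # Asn Xaa Thr Arg
}


def to_one_letter(seq3):
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in seq3.split())


def find_motifs(seq3):
    """Return {motif name: [1-based start positions]} for a protein sequence."""
    seq1 = to_one_letter(seq3)
    return {
        name: [m.start() + 1 for m in re.finditer(pattern, seq1)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    # Fragment taken from the first rows of SEQ ID NO: 42 in this listing.
    fragment = "Ile Pro Gly Asp Lys Ser Ile Ser His Arg"
    print(find_motifs(fragment)["SEQ ID NO: 38"])  # [3]
```

Positions are counted from the start of whatever string is passed in, so a fragment gives fragment-relative positions. Passing the full sequence gives positions that match the residue numbering printed in the listing.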

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1287 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1287

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

ATG AAA CGA GAT AAG GTG CAG ACC TTA CAT GGA GAA ATA CAT ATT CCC 
Met Lys Arg Asp Lys Val Gln Thr Leu His Gly Glu Ile His Ile Pro
1 5 10 15

GGT GAT AAA TCC ATT TCT CAC CGC TCT GTT ATG TTT GGC GCG CTA GCG 
Gly Asp Lys Ser Ile Ser His Arg Ser Val Met Phe Gly Ala Leu Ala
20 25 30 

GCA GGC ACA ACA ACA GTT AAA AAC TTT CTG CCG GGA GCA GAT TGT CTG 
Ala Gly Thr Thr Thr Val Lys Asn Phe Leu Pro Gly Ala Asp Cys Leu 
35 40 45 

AGC ACG ATC GAT TGC TTT AGA AAA ATG GGT GTT CAC ATT GAG CAA AGC 
Ser Thr Ile Asp Cys Phe Arg Lys Met Gly Val His Ile Glu Gln Ser
50 55 60 

AGC AGC GAT GTC GTG ATT CAC GGA AAA GGA ATC GAT GCC CTG AAA GAG 
Ser Ser Asp Val Val Ile His Gly Lys Gly Ile Asp Ala Leu Lys Glu



CCA GAA AGC CTT TTA GAT GTC GGA AAT TCA GGT ACA ACG ATT CGC CTG 
Pro Glu Ser Leu Leu Asp Val Gly Asn Ser Gly Thr Thr Ile Arg Leu
85 90 95 

ATG CTC GGA ATA TTG GCG GGC CGT CCT TTT TAC AGC GCG GTA GCC GGA 
Met Leu Gly Ile Leu Ala Gly Arg Pro Phe Tyr Ser Ala Val Ala Gly
100 105 110

GAT GAG AGC ATT GCG AAA CGC CCA ATG AAG CGT GTG ACT GAG CCT TTG 
Asp Glu Ser Ile Ala Lys Arg Pro Met Lys Arg Val Thr Glu Pro Leu
115 120 125 

AAA AAA ATG GGG GCT AAA ATC GAC GGC AGA GCC GGC GGA GAG TTT ACA 
Lys Lys Met Gly Ala Lys Ile Asp Gly Arg Ala Gly Gly Glu Phe Thr
130 135 140 

CCG CTG TCA GTG AGC GGC GCT TCA TTA AAA GGA ATT GAT TAT GTA TCA 
Pro Leu Ser Val Ser Gly Ala Ser Leu Lys Gly Ile Asp Tyr Val Ser
145 150 155 160 

CCT GTT GCA AGC GCG CAA ATT AAA TCT GCT GTT TTG CTG GCC GGA TTA 
Pro Val Ala Ser Ala Gln Ile Lys Ser Ala Val Leu Leu Ala Gly Leu
165 170 175

CAG GCT GAG GGC ACA ACA ACT GTA ACA GAG CCC CAT AAA TCT CGG GAC
Gln Ala Glu Gly Thr Thr Thr Val Thr Glu Pro His Lys Ser Arg Asp
180 185 190 

CAC ACT GAG CGG ATG CTT TCT GCT TTT GGC GTT AAG CTT TCT GAA GAT 
His Thr Glu Arg Met Leu Ser Ala Phe Gly Val Lys Leu Ser Glu Asp 
195 200 205 

CAA ACG AGT GTT TCC ATT GCT GGT GGC CAG AAA CTG ACA GCT GCT GAT 
Gln Thr Ser Val Ser Ile Ala Gly Gly Gln Lys Leu Thr Ala Ala Asp
210 215 220 

ATT TTT GTT CCT GGA GAC ATT TCT TCA GCC GCG TTT TTC CTT GCT GCT 
Ile Phe Val Pro Gly Asp Ile Ser Ser Ala Ala Phe Phe Leu Ala Ala
225 230 235 240 

GGC GCG ATG GTT CCA AAC AGC AGA ATT GTA TTG AAA AAC GTA GGT TTA 
Gly Ala Met Val Pro Asn Ser Arg Ile Val Leu Lys Asn Val Gly Leu
245 250 255 

AAT CCG ACT CGG ACA GGT ATT ATT GAT GTC CTT CAA AAC ATG GGG GCA 
Asn Pro Thr Arg Thr Gly Ile Ile Asp Val Leu Gln Asn Met Gly Ala
260 265 270 

AAA CTT GAA ATC AAA CCA TCT GCT GAT AGC GGT GCA GAG CCT TAT GGA 
Lys Leu Glu Ile Lys Pro Ser Ala Asp Ser Gly Ala Glu Pro Tyr Gly
275 280 285 

GAT TTG ATT ATA GAA ACG TCA TCT CTA AAG GCA GTT GAA ATC GGA GGA 
Asp Leu Ile Ile Glu Thr Ser Ser Leu Lys Ala Val Glu Ile Gly Gly
290 295 300 

GAT ATC ATT CCG CGT TTA ATT GAT GAG ATC CCT ATC ATC GCG CTT CTT 
Asp Ile Ile Pro Arg Leu Ile Asp Glu Ile Pro Ile Ile Ala Leu Leu
305 310 315 320 

GCG ACT CAG GCG GAA GGA ACC ACC GTT ATT AAG GAC GCG GCA GAG CTA 
Ala Thr Gln Ala Glu Gly Thr Thr Val Ile Lys Asp Ala Ala Glu Leu
325 330 335 

AAA GTG AAA GAA ACA AAC CGT ATT GAT ACT GTT GTT TCT GAG CTT CGC 
Lys Val Lys Glu Thr Asn Arg Ile Asp Thr Val Val Ser Glu Leu Arg
340 345 350 

AAG CTG GGT GCT GAA ATT GAA CCG ACA GCA GAT GGA ATG AAG GTT TAT 
Lys Leu Gly Ala Glu Ile Glu Pro Thr Ala Asp Gly Met Lys Val Tyr
355 360 365

GGC AAA CAA ACG TTG AAA GGC GGC GCT GCA GTG TCC AGC CAC GGA GAT 1152
Gly Lys Gln Thr Leu Lys Gly Gly Ala Ala Val Ser Ser His Gly Asp
370 375 380

CAT CGA ATC GGA ATG ATG CTT GGT ATT GCT TCC TGT ATA ACG GAG GAG 1200 
His Arg Ile Gly Met Met Leu Gly Ile Ala Ser Cys Ile Thr Glu Glu
385 390 395 400 



CCG ATT GAA ATC GAG CAC ACG GAT GCC ATT CAC GTT TCT TAT CCA ACC 1248 
Pro Ile Glu Ile Glu His Thr Asp Ala Ile His Val Ser Tyr Pro Thr
405 410 415 



TTC TTC GAG CAT TTA AAT AAG CTT TCG AAA AAA TCC TGA 
Phe Phe Glu His Leu Asn Lys Leu Ser Lys Lys Ser 
420 425 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Lys Arg Asp Lys Val Gln Thr Leu His Gly Glu Ile His Ile Pro
1 5 10 15

Gly Asp Lys Ser Ile Ser His Arg Ser Val Met Phe Gly Ala Leu Ala
20 25 30

Ala Gly Thr Thr Thr Val Lys Asn Phe Leu Pro Gly Ala Asp Cys Leu
35 40 45

Ser Thr Ile Asp Cys Phe Arg Lys Met Gly Val His Ile Glu Gln Ser
50 55 60

Ser Ser Asp Val Val Ile His Gly Lys Gly Ile Asp Ala Leu Lys Glu
65 70 75 80

Pro Glu Ser Leu Leu Asp Val Gly Asn Ser Gly Thr Thr Ile Arg Leu
85 90 95

Met Leu Gly Ile Leu Ala Gly Arg Pro Phe Tyr Ser Ala Val Ala Gly
100 105 110



Asp Glu Ser Ile Ala Lys Arg Pro Met Lys Arg Val Thr Glu Pro Leu
115 120 125

Lys Lys Met Gly Ala Lys Ile Asp Gly Arg Ala Gly Gly Glu Phe Thr
130 135 140

Pro Leu Ser Val Ser Gly Ala Ser Leu Lys Gly Ile Asp Tyr Val Ser
145 150 155 160

Pro Val Ala Ser Ala Gln Ile Lys Ser Ala Val Leu Leu Ala Gly Leu
165 170 175

Gln Ala Glu Gly Thr Thr Thr Val Thr Glu Pro His Lys Ser Arg Asp
180 185 190

His Thr Glu Arg Met Leu Ser Ala Phe Gly Val Lys Leu Ser Glu Asp
195 200 205

Gln Thr Ser Val Ser Ile Ala Gly Gly Gln Lys Leu Thr Ala Ala Asp
210 215 220

Ile Phe Val Pro Gly Asp Ile Ser Ser Ala Ala Phe Phe Leu Ala Ala
225 230 235 240

Gly Ala Met Val Pro Asn Ser Arg Ile Val Leu Lys Asn Val Gly Leu
245 250 255

Asn Pro Thr Arg Thr Gly Ile Ile Asp Val Leu Gln Asn Met Gly Ala
260 265 270

Lys Leu Glu Ile Lys Pro Ser Ala Asp Ser Gly Ala Glu Pro Tyr Gly
275 280 285

Asp Leu Ile Ile Glu Thr Ser Ser Leu Lys Ala Val Glu Ile Gly Gly
290 295 300

Asp Ile Ile Pro Arg Leu Ile Asp Glu Ile Pro Ile Ile Ala Leu Leu
305 310 315 320

Ala Thr Gln Ala Glu Gly Thr Thr Val Ile Lys Asp Ala Ala Glu Leu
325 330 335

Lys Val Lys Glu Thr Asn Arg Ile Asp Thr Val Val Ser Glu Leu Arg
340 345 350

Lys Leu Gly Ala Glu Ile Glu Pro Thr Ala Asp Gly Met Lys Val Tyr
355 360 365

Gly Lys Gln Thr Leu Lys Gly Gly Ala Ala Val Ser Ser His Gly Asp
370 375 380

His Arg Ile Gly Met Met Leu Gly Ile Ala Ser Cys Ile Thr Glu Glu
385 390 395 400

Pro Ile Glu Ile Glu His Thr Asp Ala Ile His Val Ser Tyr Pro Thr
405 410 415

Phe Phe Glu His Leu Asn Lys Leu Ser Lys Lys Ser
420 425



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1293 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1293



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

ATG GTA AAT GAA CAA ATC ATT GAT ATT TCA GGT CCG TTA AAG GGC GAA 
Met Val Asn Glu Gln Ile Ile Asp Ile Ser Gly Pro Leu Lys Gly Glu
1 5 10 15

ATA GAA GTG CCG GGC GAT AAG TCA ATG ACA CAC CGT GCA ATC ATG TTG 
Ile Glu Val Pro Gly Asp Lys Ser Met Thr His Arg Ala Ile Met Leu
20 25 30

GCG TCG CTA GCT GAA GGT GTA TCT ACT ATA TAT AAG CCA CTA CTT GGC 
Ala Ser Leu Ala Glu Gly Val Ser Thr Ile Tyr Lys Pro Leu Leu Gly
35 40 45

GAA GAT TGT CGT CGT ACG ATG GAC ATT TTC CGA CAC TTA GGT GTA GAA
Glu Asp Cys Arg Arg Thr Met Asp Ile Phe Arg His Leu Gly Val Glu
50 55 60 

ATC AAA GAA GAT GAT GAA AAA TTA GTT GTG ACT TCC CCA GGA TAT CAA 
Ile Lys Glu Asp Asp Glu Lys Leu Val Val Thr Ser Pro Gly Tyr Gln
65 70 75 80

GTT AAC ACG CCA CAT CAA GTA TTG TAT ACA GGT AAT TCT GGT ACG ACA 
Val Asn Thr Pro His Gln Val Leu Tyr Thr Gly Asn Ser Gly Thr Thr
85 90 95 

ACA CGA TTA TTG GCA GGT TTG TTA AGT GGT TTA GGT AAT GAA AGT GTT 

Thr Arg Leu Leu Ala Gly Leu Leu Ser Gly Leu Gly Asn Glu Ser Val 
100 105 110 

TTG TCT GGC GAT GTT TCA ATT GGT AAA AGG CCA ATG GAT CGT GTC TTG 
Leu Ser Gly Asp Val Ser Ile Gly Lys Arg Pro Met Asp Arg Val Leu
115 120 125 

AGA CCA TTG AAA CTT ATG GAT GCG AAT ATT GAA GGT ATT GAA GAT AAT 
Arg Pro Leu Lys Leu Met Asp Ala Asn Ile Glu Gly Ile Glu Asp Asn
130 135 140 

TAT ACA CCA TTA ATT ATT AAG CCA TCT GTC ATA AAA GGT ATA AAT TAT 
Tyr Thr Pro Leu Ile Ile Lys Pro Ser Val Ile Lys Gly Ile Asn Tyr
145 150 155 160 

CAA ATG GAA GTT GCA AGT GCA CAA GTA AAA AGT GCC ATT TTA TTT GCA 
Gln Met Glu Val Ala Ser Ala Gln Val Lys Ser Ala Ile Leu Phe Ala
165 170 175 

AGT TTG TTT TCT AAG GAA CCG ACC ATC ATT AAA GAA TTA GAT GTA AGT 
Ser Leu Phe Ser Lys Glu Pro Thr Ile Ile Lys Glu Leu Asp Val Ser
180 185 190 

CGA AAT CAT ACT GAG ACG ATG TTC AAA CAT TTT AAT ATT CCA ATT GAA
Arg Asn His Thr Glu Thr Met Phe Lys His Phe Asn Ile Pro Ile Glu
195 200 205 

GCA GAA GGG TTA TCA ATT AAT ACA ACC CCT GAA GCA ATT CGA TAC ATT 
Ala Glu Gly Leu Ser Ile Asn Thr Thr Pro Glu Ala Ile Arg Tyr Ile
210 215 220

AAA CCT GCA GAT TTT CAT GTT CCT GGC GAT ATT TCA TCT GCA GCG TTC
Lys Pro Ala Asp Phe His Val Pro Gly Asp Ile Ser Ser Ala Ala Phe
225 230 235 240

TTT ATT GTT GCA GCA CTT ATC ACA CCA GGA AGT GAT GTA ACA ATT CAT
Phe Ile Val Ala Ala Leu Ile Thr Pro Gly Ser Asp Val Thr Ile His
245 250 255 

AAT GTT GGA ATC AAT CAA ACA CGT TCA GGT ATT ATT GAT ATT GTT GAA 
Asn Val Gly Ile Asn Gln Thr Arg Ser Gly Ile Ile Asp Ile Val Glu
260 265 270 

AAA ATG GGC GGT AAT ATC CAA CTT TTC AAT CAA ACA ACT GGT GCT GAA 
Lys Met Gly Gly Asn Ile Gln Leu Phe Asn Gln Thr Thr Gly Ala Glu
275 280 285 

CCT ACT GCT TCT ATT CGT ATT CAA TAC ACA CCA ATG CTT CAA CCA ATA 
Pro Thr Ala Ser Ile Arg Ile Gln Tyr Thr Pro Met Leu Gln Pro Ile
290 295 300 

ACA ATC GAA GGA GAA TTA GTT CCA AAA GCA ATT GAT GAA CTG CCT GTA 
Thr Ile Glu Gly Glu Leu Val Pro Lys Ala Ile Asp Glu Leu Pro Val
305 310 315 320 

ATA GCA TTA CTT TGT ACA CAA GCA GTT GGC ACG AGT ACA ATT AAA GAT 
Ile Ala Leu Leu Cys Thr Gln Ala Val Gly Thr Ser Thr Ile Lys Asp
325 330 335 

GCC GAG GAA TTA AAA GTA AAA GAA ACA AAT AGA ATT GAT ACA ACG GCT 
Ala Glu Glu Leu Lys Val Lys Glu Thr Asn Arg Ile Asp Thr Thr Ala
340 345 350 

GAT ATG TTA AAC TTG TTA GGG TTT GAA TTA CAA CCA ACT AAT GAT GGA 
Asp Met Leu Asn Leu Leu Gly Phe Glu Leu Gln Pro Thr Asn Asp Gly
355 360 365 

TTG ATT ATT CAT CCG TCA GAA TTT AAA ACA AAT GCA ACA GAT ATT TTA 
Leu Ile Ile His Pro Ser Glu Phe Lys Thr Asn Ala Thr Asp Ile Leu
370 375 380 

ACT GAT CAT CGA ATA GGA ATG ATG CTT GCA GTT GCT TGT GTA CTT TCA 
Thr Asp His Arg Ile Gly Met Met Leu Ala Val Ala Cys Val Leu Ser
385 390 395 400



AGC GAG CCT GTC AAA ATC AAA CAA TTT GAT GCT GTA AAT GTA TCA TTT 
Ser Glu Pro Val Lys Ile Lys Gln Phe Asp Ala Val Asn Val Ser Phe
405 410 415 

CCA GGA TTT TTA CCA AAA CTA AAG CTT TTA CAA AAT GAG GGA TAA 
Pro Gly Phe Leu Pro Lys Leu Lys Leu Leu Gln Asn Glu Gly

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 430 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Val Asn Glu Gln Ile Ile Asp Ile Ser Gly Pro Leu Lys Gly Glu
1 5 10 15

Ile Glu Val Pro Gly Asp Lys Ser Met Thr His Arg Ala Ile Met Leu
20 25 30

Ala Ser Leu Ala Glu Gly Val Ser Thr Ile Tyr Lys Pro Leu Leu Gly
35 40 45

Glu Asp Cys Arg Arg Thr Met Asp Ile Phe Arg His Leu Gly Val Glu
50 55 60

Ile Lys Glu Asp Asp Glu Lys Leu Val Val Thr Ser Pro Gly Tyr Gln
65 70 75 80

Val Asn Thr Pro His Gln Val Leu Tyr Thr Gly Asn Ser Gly Thr Thr
85 90 95

Thr Arg Leu Leu Ala Gly Leu Leu Ser Gly Leu Gly Asn Glu Ser Val
100 105 110

Leu Ser Gly Asp Val Ser Ile Gly Lys Arg Pro Met Asp Arg Val Leu
115 120 125

Arg Pro Leu Lys Leu Met Asp Ala Asn Ile Glu Gly Ile Glu Asp Asn
130 135 140

Tyr Thr Pro Leu Ile Ile Lys Pro Ser Val Ile Lys Gly Ile Asn Tyr
145 150 155 160

Gln Met Glu Val Ala Ser Ala Gln Val Lys Ser Ala Ile Leu Phe Ala
165 170 175

Ser Leu Phe Ser Lys Glu Pro Thr Ile Ile Lys Glu Leu Asp Val Ser






Arg Asn His Thr Glu Thr Met Phe Lys His Phe Asn Ile Pro Ile Glu
195 200 205

Ala Glu Gly Leu Ser Ile Asn Thr Thr Pro Glu Ala Ile Arg Tyr Ile
210 215 220

Lys Pro Ala Asp Phe His Val Pro Gly Asp Ile Ser Ser Ala Ala Phe
225 230 235 240

Phe Ile Val Ala Ala Leu Ile Thr Pro Gly Ser Asp Val Thr Ile His
245 250 255

Asn Val Gly Ile Asn Gln Thr Arg Ser Gly Ile Ile Asp Ile Val Glu
260 265 270

Lys Met Gly Gly Asn Ile Gln Leu Phe Asn Gln Thr Thr Gly Ala Glu
275 280 285

Pro Thr Ala Ser Ile Arg Ile Gln Tyr Thr Pro Met Leu Gln Pro Ile
290 295 300

Thr Ile Glu Gly Glu Leu Val Pro Lys Ala Ile Asp Glu Leu Pro Val
305 310 315 320

Ile Ala Leu Leu Cys Thr Gln Ala Val Gly Thr Ser Thr Ile Lys Asp
325 330 335

Ala Glu Glu Leu Lys Val Lys Glu Thr Asn Arg Ile Asp Thr Thr Ala
340 345 350

Asp Met Leu Asn Leu Leu Gly Phe Glu Leu Gln Pro Thr Asn Asp Gly
355 360 365

Leu Ile Ile His Pro Ser Glu Phe Lys Thr Asn Ala Thr Asp Ile Leu
370 375 380

Thr Asp His Arg Ile Gly Met Met Leu Ala Val Ala Cys Val Leu Ser
385 390 395 400

Ser Glu Pro Val Lys Ile Lys Gln Phe Asp Ala Val Asn Val Ser Phe

(2) INFORMATION FOR SEQ ID NO: 45:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GGAACATATG AAACGAGATA AGGTGCAG 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 
GGAATTCAAA CTTCAGGATC TTGAGATAGA AAATG 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(A) DESCRIPTION: Synthetic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
GGGGCCATGG TAAATGAACA AATCATTG

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

GGGGGAGCTC ATTATCCCTC ATTTTGTAAA AGC 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 480 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Leu Thr Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp
1 5 10 15

Gln Gln Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg
20 25 30

Ala Leu Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn

Leu Leu His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu
50 55 60

Leu Lys Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val
65 70 75 80

Val Glu Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu






Tyr Leu Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala
100 105 110

Ala Leu Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly
115 120 125

Asn Ala Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu
130 135 140

Arg Ala Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu
145 150 155 160

Pro Ile Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu
165 170 175

Leu Ala Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys
180 185 190

Ala Pro Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys
195 200 205

Pro Ile Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys
210 215 220

Phe Gly Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr
225 230 235 240

Ile Pro Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser
245 250 255

Asp Ala Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly

Thr Thr Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp
275 280 285

Ala Arg Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr
290 295 300

Gln Thr Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu

Lys Pro Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu
325 330 335

Thr Ala Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser
340 345 350

Ala Asn Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu
355 360 365

Cys Asn Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val
370 375 380

Lys Thr Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser
385 390 395 400

Ile Lys Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val
405 410 415

Cys Thr Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala
420 425 430

Gly Met Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val
435 440 445

Arg Ile Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp
450 455 460 

Asp Val Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro 
465 470 475 480 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Leu Ala Pro Ser Ile Glu Val His Pro Gly Val Ala His Ser Ser Asn

Val Ile Cys Ala Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
20 25 30

Val Leu Ala Ala Leu Gly Ser Gly Thr Cys Arg Ile Lys Asn Leu Leu

His Ser Asp Asp Thr Glu Val Met Leu Asn Ala Leu Glu Arg Leu Gly
50 55 60

Ala Ala Thr Phe Ser Trp Glu Glu Glu Gly Glu Val Leu Val Val Asn
65 70 75 80

Gly Lys Gly Gly Asn Leu Gln Ala Ser Ser Ser Pro Leu Tyr Leu Gly
85 90 95

Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Thr Val Ala Thr Leu Ala
100 105 110

Asn Ser Ser Thr Val Asp Ser Ser Val Leu Thr Gly Asn Asn Arg Met
115 120 125

Lys Gln Arg Pro Ile Gly Asp Leu Val Asp Ala Leu Thr Ala Asn Val
130 135 140

Leu Pro Leu Asn Thr Ser Lys Gly Arg Ala Ser Leu Pro Leu Lys Ile
145 150 155 160

Ala Ala Ser Gly Gly Phe Ala Gly Gly Asn Ile Asn Leu Ala Ala Lys
165 170 175

Val Ser Ser Gln Tyr Val Ser Ser Leu Leu Met Cys Ala Pro Tyr Ala
180 185 190

Lys Glu Pro Val Thr Leu Arg Leu Val Gly Gly Lys Pro Ile Ser Gln
195 200 205

Pro Tyr Ile Asp Met Thr Thr Ala Met Met Arg Ser Phe Gly Ile Asp
210 215 220

Val Gln Lys Ser Thr Thr Glu Glu His Thr Tyr His Ile Pro Gln Gly
225 230 235 240

Arg Tyr Val Asn Pro Ala Glu Tyr Val Ile Glu Ser Asp Ala Ser Cys
245 250 255

Ala Thr Tyr Pro Leu Ala Val Ala Ala Val Thr Gly Thr Thr Cys Thr
260 265 270

Val Pro Asn Ile Gly Ser Ala Ser Leu Gln Gly Asp Ala Arg Phe Ala
275 280 285

Val Glu Val Leu Arg Pro Met Gly Cys Thr Val Glu Gln Thr Glu Thr
290 295 300

Ser Thr Thr Val Thr Gly Pro Ser Asp Gly Ile Leu Arg Ala Thr Ser
305 310 315 320

Lys Arg Gly Tyr Gly Thr Asn Asp Arg Cys Val Pro Arg Cys Phe Arg
325 330 335

Thr Gly Ser His Arg Pro Met Glu Lys Ser Gln Thr Thr Pro Pro Val
340 345 350

Ser Ser Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn Arg Ile Lys
355 360 365

Ala Met Lys Asp Glu Leu Ala Lys Phe Gly Val Ile Cys Arg Glu His
370 375 380

Asp Asp Gly Leu Glu Ile Asp Gly Ile Asp Arg Ser Asn Leu Arg Gln
385 390 395 400

Pro Val Gly Gly Val Phe Cys Tyr Asp Asp His Arg Val Ala Phe Ser
405 410 415

Phe Ser Val Leu Ser Leu Val Thr Pro Gln Pro Thr Leu Ile Leu Glu
420 425 430

Lys Glu Cys Val Gly Lys Thr Trp Pro Gly Trp Trp Asp Thr Leu Arg
435 440 445

Gln Leu Phe Lys Val Lys Leu Glu Gly Lys Glu Leu
450 455 460 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 444 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Lys Ala Ser Glu Ile Val Leu Gln Pro Ile Arg Glu Ile Ser Gly Leu
5 10 15

Ile Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile Leu Leu Leu
20 25 30

Ala Ala Leu Ser Glu Gly Thr Thr Val Val Asp Asn Leu Leu Asn Ser
35 40 45

Asp Asp Ile Asn Tyr Met Leu Asp Ala Leu Lys Lys Leu Gly Leu Asn
50 55 60

Val Glu Arg Asp Ser Val Asn Asn Arg Ala Val Val Glu Gly Cys Gly
65 70 75 80

Gly Ile Phe Pro Ala Ser Leu Asp Ser Lys Ser Asp Ile Glu Leu Tyr
85 90 95

Leu Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Val Thr
100 105 110

Ala Ala Gly Gly Asn Ala Ser Tyr Val Leu Asp Gly Val Pro Arg Met
115 120 125

Arg Glu Arg Pro Ile Gly Asp Leu Val Val Gly Leu Lys Gln Leu Gly
130 135 140

Ala Asp Val Glu Cys Thr Leu Gly Thr Asn Cys Pro Pro Val Arg Val
145 150 155 160

Asn Ala Asn Gly Gly Leu Pro Gly Gly Lys Val Lys Leu Ser Gly Ser
165 170 175

Ile Ser Ser Gln Tyr Leu Thr Ala Leu Leu Met Ala Ala Pro Leu Ala
180 185 190

Leu Gly Asp Val Glu Ile Glu Ile Ile Asp Lys Leu Ile Ser Val Pro
195 200 205

Tyr Val Glu Met Thr Leu Lys Leu Met Glu Arg Phe Gly Val Ser Ala
210 215 220

Glu His Ser Asp Ser Trp Asp Arg Phe Phe Val Lys Gly Gly Gln Lys
225 230 235 240

Tyr Lys Ser Pro Gly Asn Ala Tyr Val Glu Gly Asp Ala Ser Ser Ala






38-21(10660)A 



Glu Gly Cys Gly Thr Thr Ser Leu Gln Gly Asp Val Lys Phe Ala Glu
275 280 285

Val Leu Glu Lys Met Gly Cys Lys Val Ser Trp Thr Glu Asn Ser Val
290 295 300

Thr Val Thr Gly Pro Ser Arg Asp Ala Phe Gly Met Arg His Leu Arg
305 310 315 320

Ala Val Asp Val Asn Met Asn Lys Met Pro Asp Val Ala Met Thr Leu
325 330 335

Ala Val Val Ala Leu Phe Ala Asp Gly Pro Thr Thr Ile Arg Asp Val
340 345 350

Ala Ser Trp Arg Val Lys Glu Thr Glu Arg Met Ile Ala Ile Cys Thr
355 360 365

Glu Leu Arg Lys Leu Gly Ala Thr Val Glu Glu Gly Ser Asp Tyr Cys
370 375 380

Val Ile Thr Pro Pro Ala Lys Val Lys Pro Ala Glu Ile Asp Thr Tyr
385 390 395 400

Asp Asp His Arg Met Ala Met Ala Phe Ser Leu Ala Ala Cys Ala Asp
405 410 415

Val Pro Val Thr Ile Lys Asp Pro Gly Cys Thr Arg Lys Thr Phe Pro
420 425 430

Asp Tyr Phe Gln Val Leu Glu Ser Ile Thr Lys His
435 440

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 444 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Lys Ala Ser Glu Ile Val Leu Gln Pro Ile Arg Glu Ile Ser Gly Leu
1 5 10 15

Ile Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile Leu Leu Leu
20 25 30

Ala Ala Leu Ser Glu Gly Thr Thr Val Val Asp Asn Leu Leu Asn Ser
35 40 45

Asp Asp Ile Asn Tyr Met Leu Asp Ala Leu Lys Arg Leu Gly Leu Asn
50 55 60

Val Glu Thr Asp Ser Glu Asn Asn Arg Ala Val Val Glu Gly Cys Gly
65 70 75 80

Gly Ile Phe Pro Ala Ser Ile Asp Ser Lys Ser Asp Ile Glu Leu Tyr
85 90 95

Leu Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Val Thr
100 105 110

Ala Ala Gly Gly Asn Ala Ser Tyr Val Leu Asp Gly Val Pro Arg Met
115 120 125

Arg Glu Arg Pro Ile Gly Asp Leu Val Val Gly Leu Lys Gln Leu Gly
130 135 140

Ala Asp Val Glu Cys Thr Leu Gly Thr Asn Cys Pro Pro Val Arg Val
145 150 155 160

Asn Ala Asn Gly Gly Leu Pro Gly Gly Lys Val Lys Leu Ser Gly Ser
165 170 175

Ile Ser Ser Gln Tyr Leu Thr Ala Leu Leu Met Ser Ala Pro Leu Ala
180 185 190

Leu Gly Asp Val Glu Ile Glu Ile Val Asp Lys Leu Ile Ser Val Pro
195 200 205

Tyr Val Glu Met Thr Leu Lys Leu Met Glu Arg Phe Gly Val Ser Val
210 215 220

Glu His Ser Asp Ser Trp Asp Arg Phe Phe Val Lys Gly Gly Gln Lys






Tyr Lys Ser Pro Gly Asn Ala Tyr Val Glu Gly Asp Ala Ser Ser Ala
245 250 255

Cys Tyr Phe Leu Ala Gly Ala Ala Ile Thr Gly Glu Thr Val Thr Val
260 265 270

Glu Gly Cys Gly Thr Thr Ser Leu Gln Gly Asp Val Lys Phe Ala Glu
275 280 285

Val Leu Glu Lys Met Gly Cys Lys Val Ser Trp Thr Glu Asn Ser Val
290 295 300

Thr Val Thr Gly Pro Pro Arg Asp Ala Phe Gly Met Arg His Leu Arg
305 310 315 320

Ala Ile Asp Val Asn Met Asn Lys Met Pro Asp Val Ala Met Thr Leu
325 330 335

Ala Val Val Ala Leu Phe Ala Asp Gly Pro Thr Thr Ile Arg Asp Val
340 345 350

Ala Ser Trp Arg Val Lys Glu Thr Glu Arg Met Ile Ala Ile Cys Thr
355 360 365

Glu Leu Arg Lys Leu Gly Ala Thr Val Glu Glu Gly Ser Asp Tyr Cys
370 375 380

Val Ile Thr Pro Pro Lys Lys Val Lys Thr Ala Glu Ile Asp Thr Tyr
385 390 395 400

Asp Asp His Arg Met Ala Met Ala Phe Ser Leu Ala Ala Cys Ala Asp
405 410 415

Val Pro Ile Thr Ile Asn Asp Ser Gly Cys Thr Arg Lys Thr Phe Pro
420 425 430

Asp Tyr Phe Gln Val Leu Glu Arg Ile Thr Lys His
435 440

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 444 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Lys Pro Asn Glu Ile Val Leu Gln Pro Ile Lys Asp Ile Ser Gly Thr
1 5 10 15

Val Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile Leu Leu Leu
20 25 30

Ala Ala Leu Ser Lys Gly Arg Thr Val Val Asp Asn Leu Leu Ser Ser
35 40 45

Asp Asp Ile His Tyr Met Leu Gly Ala Leu Lys Thr Leu Gly Leu His
50 55 60

Val Glu Asp Asp Asn Glu Asn Gln Arg Ala Ile Val Glu Gly Cys Gly
65 70 75 80

Gly Gln Phe Pro Val Gly Lys Lys Ser Glu Glu Glu Ile Gln Leu Phe
85 90 95

Leu Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Val Thr
100 105 110

Val Ala Gly Gly His Ser Arg Tyr Val Leu Asp Gly Val Pro Arg Met
115 120 125

Arg Glu Arg Pro Ile Gly Asp Leu Val Asp Gly Leu Lys Gln Leu Gly
130 135 140

Ala Glu Val Asp Cys Phe Leu Gly Thr Asn Cys Pro Pro Val Arg Ile
145 150 155 160

Val Ser Lys Gly Gly Leu Pro Gly Gly Lys Val Lys Leu Ser Gly Ser
165 170 175

Ile Ser Ser Gln Tyr Leu Thr Ala Leu Leu Met Ala Ala Pro Leu Ala
180 185 190



Leu Gly
Asp Val
195
Ile Glu
Ile Ile Asp Lys
200
Ile Ser Val Pro
205

Tyr
Met
Met Hu Arg Phe Gly Val Ser Val
220

Glu His
225
Ser Trp Asp
230
Lys Phe Leu Val Arg Gly Gly Gln Lys
235 240

Tyr Lys Ser Pro Gly Lys Ala Tyr Val Glu Gly Asp Ala Ser Ser Ala
245 250 255 

Ser Tyr Phe Leu Ala Gly Ala Ala Val Thr Gly Gly Thr Val Thr Val
260 265 270

Glu Gly Cys Gly Thr Ser Ser Leu Gln Gly Asp Val Lys Phe Ala Glu
275 280 285

Val Leu Glu Lys Met Gly Ala Glu Val Thr Trp Thr Glu Asn Ser Val
290 295 300

Thr Val Lys Gly Pro Pro Arg Asn Ser Ser Gly Met Lys His Leu Arg
305 310 315 320

Ala Val Asp Val Asn Met Asn Lys Met Pro Asp Val Ala Met Thr Leu
325 330 335

Ala Val Val Ala Leu Phe Ala Asp Gly Pro Thr Ala Ile Arg Asp Val
340 345 350

Ala Ser Trp Arg Val Lys Glu Thr Glu Arg Met Ile Ala Ile Cys Thr
355 360 365

Glu Leu Arg Lys Leu Gly Ala Thr Val Val Glu Gly Ser Asp Tyr Cys
370 375 380

Ile Ile Thr Pro Pro Glu Lys Leu Asn Val Thr Glu Ile Asp Thr Tyr
385 390 395 400

Asp Asp His Arg Met Ala Met Ala Phe Ser Leu Ala Ala Cys Ala Asp
405 410 415

Val Pro Val Thr Ile Lys Asp Pro Gly Cys Thr Arg Lys Thr Phe Pro
420 425 430

Asn Tyr Phe Asp Val Leu Gln Gln Tyr Ser Lys His
435 440

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 444 amino acids

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Lys Pro His Glu Ile Val Leu Xaa Pro Ile Lys Asp Ile Ser Gly Thr
1 5 10 15

Val Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile Leu Leu Leu
20 25 30

Ala Ala Leu Ser Glu Gly Arg Thr Val Val Asp Asn Leu Leu Ser Ser
35 40 45

Asp Asp Ile His Tyr Met Leu Gly Ala Leu Lys Thr Leu Gly Leu His
50 55 60

Val Glu Asp Asp Asn Glu Asn Gln Arg Ala Ile Val Glu Gly Cys Gly
65 70 75 80

Gly Gln Phe Pro Val Gly Lys Lys Ser Glu Glu Glu Ile Gln Leu Phe
85 90 95

Leu Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Val Thr
100 105 110

Val Ala Gly Gly His Ser Arg Tyr Val Leu Asp Gly Val Pro Arg Met
115 120 125

Arg Glu Arg Pro Ile Gly Asp Leu Val Asp Gly Leu Lys Gln Leu Gly
130 135 140

Ala Glu Val Asp Cys Ser Leu Gly Thr Asn Cys Pro Pro Val Arg Ile
145 150 155 160

Val Ser Lys Gly Gly Leu Pro Gly Gly Lys Val Lys Leu Ser Gly Ser
165 170 175

Ile Ser Ser Gln Tyr Leu Thr Ala Leu Leu Met Ala Ala Pro Leu Ala



Leu Gly Asp Val Glu Ile Glu Ile Ile Asp Lys Leu Ile Ser Val Pro
195 200 205

Tyr Val Glu Met Thr Leu Lys Leu Met Glu Arg Phe Gly Val Phe Val

Glu His Ser Ser Gly Trp Asp Arg Phe Leu Val Lys Gly Gly Gln Lys
225 230 235 240

Tyr Lys Ser Pro Gly Lys Ala Phe Val Glu Gly Asp Ala Ser Ser Ala
245 250 255

Ser Tyr Phe Leu Ala Gly Ala Ala Val Thr Gly Gly Thr Val Thr Val
260 265 270

Glu Gly Cys Gly Thr Ser Ser Leu Gln Gly Asp Val Lys Phe Ala Glu
275 280 285

Val Leu Glu Lys Met Gly Ala Glu Val Thr Trp Thr Glu Asn Ser Val
290 295 300

Thr Val Lys Gly Pro Pro Arg Asn Ser Ser Gly Met Lys His Leu Arg
305 310 315 320

Ala Ile Asp Val Asn Met Asn Lys Met Pro Asp Val Ala Met Thr Leu
325 330 335

Ala Val Val Ala Leu Phe Ala Asp Gly Pro Thr Thr Ile Arg Asp Val
340 345 350

Ala Ser Trp Arg Val Lys Glu Thr Glu Arg Met Ile Ala Ile Cys Thr
355 360 365

Glu Leu Arg Lys Leu Gly Ala Thr Val Val Glu Gly Ser Asp Tyr Cys
370 375 380

Ile Ile Thr Pro Pro Glu Lys Leu Asn Val Thr Glu Ile Asp Thr Tyr
385 390 395 400

Asp Asp His Arg Met Ala Met Ala Phe Ser Leu Ala Ala Cys Ala Asp
405 410 415

Val Pro Val Thr Ile Lys Asn Pro Gly Cys Thr Arg Lys Thr Phe Pro
420 425 430

Asp Tyr Phe Glu Val Leu Gln Lys Tyr Ser Lys His
435 440

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 444 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Lys Pro Ser Glu Ile Val Leu Gln Pro Ile Lys Glu Ile Ser Gly Thr
1 5 10 15

Val Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile Leu Leu Leu
20 25 30

Ala Ala Leu Ser Glu Gly Thr Thr Val Val Asp Asn Leu Leu Ser Ser
35 40 45

Asp Asp Ile His Tyr Met Leu Gly Ala Leu Lys Thr Leu Gly Leu His
50 55 60

Val Glu Glu Asp Ser Ala Asn Gln Arg Ala Val Val Glu Gly Cys Gly
65 70 75 80

Gly Leu Phe Pro Val Gly Lys Glu Ser Lys Glu Glu Ile Gln Leu Phe

Leu Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Val Thr
100 105 110

Val Ala Gly Gly Asn Ser Arg Tyr Val Leu Asp Gly Val Pro Arg Met
115 120 125

Arg Glu Arg Pro Ile Ser Asp Leu Val Asp Gly Leu Lys Gln Leu Gly
130 135 140

Ala Glu Val Asp Cys Phe Leu Gly Thr Lys Cys Pro Pro Val Arg Ile
145 150 155 160

Val Ser Lys Gly Gly Leu Pro Gly Gly Lys Val Lys Leu Ser Gly Ser
165 170 175

Ile Ser Ser Gln Tyr Leu Thr Ala Leu Leu Met Ala Ala Pro Leu Ala
180 185 190

Leu Gly Asp Val Glu Ile Glu Ile Ile Asp Lys Leu Ile Ser Val Pro
195 200 205

Tyr Val Glu Met Thr Leu Lys Leu Met Glu Arg Phe Gly Ile Ser Val
210 215 220

Glu His Ser Ser Ser Trp Asp Arg Phe Phe Val Arg Gly Gly Gln Lys
225 230 235 240

Tyr Lys Ser Pro Gly Lys Ala Phe Val Glu Gly Asp Ala Ser Ser Ala
245 250 255

Ser Tyr Phe Leu Ala Gly Ala Ala Val Thr Gly Gly Thr Ile Thr Val
260 265 270

Glu Gly Cys Gly Thr Asn Ser Leu Gln Gly Asp Val Lys Phe Ala Glu
275 280 285

Val Leu Glu Lys Met Gly Ala Glu Val Thr Trp Thr Glu Asn Ser Val
290 295 300

Thr Val Lys Gly Pro Pro Arg Ser Ser Ser Gly Arg Lys His Leu Arg
305 310 315 320

Ala Ile Asp Val Asn Met Asn Lys Met Pro Asp Val Ala Met Thr Leu
325 330 335

Ala Val Val Ala Leu Tyr Ala Asp Gly Pro Thr Ala Ile Arg Asp Val
340 345 350

Ala Ser Trp Arg Val Lys Glu Thr Glu Arg Met Ile Ala Ile Cys Thr
355 360 365

Glu Leu Arg Lys Leu Gly Ala Thr Val Glu Glu Gly Pro Asp Tyr Cys
370 375 380

Ile Ile Thr Pro Pro Glu Lys Leu Asn Val Thr Asp Ile Asp Thr Tyr
385 390 395 400

Asp Asp His Arg Met Ala Met Ala Phe Ser Leu Ala Ala Cys Ala Asp
405 410 415



Arg Lys Thr Phe Pro
430

Asn Tyr Phe Asp Val Leu Gln Gln Tyr Ser Lys His
435 440

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 444 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

Ala Gly Ala Glu Glu Ile Val Leu Gln Pro Ile Lys Glu Ile Ser Gly
1 5 10 15

Thr Val Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile Leu Leu
20 25 30

Leu Ala Ala Leu Ser Glu Gly Thr Thr Val Val Asp Asn Leu Leu Asn
35 40 45

Ser Glu Asp Val His Tyr Met Leu Gly Ala Leu Arg Thr Leu Gly Leu
50 55 60

Ser Val Glu Ala Asp Lys Ala Ala Lys Arg Ala Val Val Val Gly Cys
65 70 75 80

Gly Gly Lys Phe Pro Val Glu Asp Ala Lys Glu Glu Val Gln Leu Phe
85 90 95

Leu Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Val Thr
100 105 110

Ala Ala Gly Gly Asn Ala Thr Tyr Val Leu Asp Gly Val Pro Arg Met
115 120 125

Arg Glu Arg Pro Ile Gly Asp Leu Val Val Gly Leu Lys Gln Leu Gly
130 135 140



Ala Asp Val Asp Cys Phe Leu Gly Thr Asp Cys Pro Pro Val Arg Val
145 150 155 160

Asn Gly Ile Gly Gly Leu Pro Gly Gly Lys Val Lys Leu Ser Gly Ser
165 170 175

Ile Ser Ser Gln Tyr Leu Ser Ala Leu Leu Met Ala Ala Pro Leu Pro
180 185 190

Leu Gly Asp Val Glu Ile Glu Ile Ile Asp Lys Leu Ile Ser Ile Pro
195 200 205

Tyr Val Glu Met Thr Leu Arg Leu Met Glu Arg Phe Gly Val Lys Ala
210 215 220

Glu His Ser Asp Ser Trp Asp Arg Phe Tyr Ile Lys Gly Gly Gln Lys
225 230 235 240

Tyr Lys Ser Pro Lys Asn Ala Tyr Val Glu Gly Asp Ala Ser Ser Ala
245 250 255

Ser Tyr Phe Leu Ala Gly Ala Ala Ile Thr Gly Gly Thr Val Thr Val
260 265 270

Glu Gly Cys Gly Thr Thr Ser Leu Gln Gly Asp Val Lys Phe Ala Glu
275 280 285

Val Leu Glu Met Met Gly Ala Lys Val Thr Trp Thr Glu Thr Ser Val
290 295 300

Thr Val Thr Gly Pro Pro Arg Glu Pro Phe Gly Arg Lys His Leu Lys
305 310 315 320

Ala Ile Asp Val Asn Met Asn Lys Met Pro Asp Val Ala Met Thr Leu
325 330 335

Ala Val Val Ala Leu Phe Ala Asp Gly Pro Thr Ala Ile Arg Asp Val
340 345 350

Ala Ser Trp Arg Val Lys Glu Thr Glu Arg Met Val Ala Ile Arg Thr
355 360 365

Glu Leu Thr Lys Leu Gly Ala Ser Val Glu Glu Gly Pro Asp Tyr Cys
370 375 380

Ile Ile Thr Pro Pro Glu Lys Leu Asn Val Thr Ala Ile Asp Thr Tyr
385 390 395 400

Asp Asp His Arg Met Ala Met Ala Phe Ser Leu Ala Ala Cys Ala Glu

Val Pro Val Thr Ile Arg Asp Pro Gly Cys Thr Arg Lys Thr Phe Pro
420 425 430

Asp Tyr Phe Asp Val Leu Ser Thr Phe Val Lys Asn
435 440

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Met Glu Ser Leu Thr Leu Gln Pro Ile Ala Arg Val Asp Gly Ala Ile
1 5 10 15

Asn Leu Pro Gly Ser Lys Ser Val Ser Asn Arg Ala Leu Leu Leu Ala
20 25 30

Ala Leu Ala Cys Gly Lys Thr Val Leu Thr Asn Leu Leu Asp Ser Asp
35 40 45

Asp Val Arg His Met Leu Asn Ala Leu Ser Ala Leu Gly Ile Asn Tyr
50 55 60

Thr Leu Ser Ala Asp Arg Thr Arg Cys Asp Ile Thr Gly Asn Gly Gly
65 70 75 80

Pro Leu Arg Ala Pro Gly Ala Leu Glu Leu Phe Leu Gly Asn Ala Gly
85 90 95

Thr Ala Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Gln Asn Glu
100 105 110

Ile Val Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro Ile Gly His
115 120 125

Leu Val Asp Ser Leu Arg Gln Gly Gly Ala Asn Ile Asp Tyr Leu Glu

Gln Glu Asn Tyr Pro Pro Leu Arg Leu Arg Gly Gly Phe Ile Gly Gly
145 150 155 160

Asp Ile Glu Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu
165 170 175

Leu Met Thr Ala Pro Leu Ala Pro Lys Asp Thr Ile Ile Arg Val Lys
180 185 190

Gly Glu Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu Asn Leu Met
195 200 205

Lys Thr Phe Gly Val Glu Ile Ala Asn His His Tyr Gln Gln Phe Val
210 215 220

Val Lys Gly Gly Gln Gln Tyr His Ser Pro Gly Arg Tyr Leu Val Glu
225 230 235 240

Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Gly Ala Ile Lys
245 250 255

Gly Gly Thr Val Lys Val Thr Gly Ile Gly Arg Lys Ser Met Gln Gly
260 265 270

Asp Ile Arg Phe Ala Asp Val Leu Glu Lys Met Gly Ala Thr Ile Thr
275 280 285

Trp Gly Asp Asp Phe Ile Ala Cys Thr Arg Gly Glu Leu His Ala Ile
290 295 300

Asp Met Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr
305 310 315 320

Thr Ala Leu Phe Ala Lys Gly Thr Thr Thr Leu Arg Asn Ile Tyr Asn
325 330 335

Trp Arg Val Lys Glu Thr Asp Arg Leu Phe Ala Met Ala Thr Glu Leu
340 345 350

Arg Lys Val Gly Ala Glu Val Glu Glu Gly His Asp Tyr Ile Arg Ile
355 360 365

Thr Pro Pro Ala Lys Leu Gln His Ala Asp Ile Gly Thr Tyr Asn Asp
370 375 380

His Arg Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro
385 390 395 400

Val Thr Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr
405 410 415

Phe Glu Gln Leu Ala Arg Met Ser Thr Pro Ala
420 425

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 427 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Met Glu Ser Leu Thr Leu Gln Pro Ile Ala Arg Val Asp Gly Ala Ile
1 5 10 15

Asn Leu Pro Gly Ser Lys Ser Val Ser Asn Arg Ala Leu Leu Leu Ala
20 25 30

Ala Leu Ala Cys Gly Lys Thr Val Leu Thr Asn Leu Leu Asp Ser Asp
35 40 45

Asp Val Arg His Met Leu Asn Ala Leu Ser Ala Leu Gly Ile Asn Tyr
50 55 60

Thr Leu Ser Ala Asp Arg Thr Arg Cys Asp Ile Thr Gly Asn Gly Gly
65 70 75 80

Pro Leu Arg Ala Ser Gly Thr Leu Glu Leu Phe Leu Gly Asn Ala Gly
85 90 95

Thr Ala Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Gln Asn Glu
100 105 110

Ile Val Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro Ile Gly His
115 120 125



Leu Val Asp Ser Leu Arg Gln Gly Gly Ala Asn Ile Asp Tyr Leu Glu
130 135 140

Gln Glu Asn Tyr Pro Pro Leu Arg Leu Arg Gly Gly Phe Ile Gly Gly
145 150 155 160

Asp Ile Glu Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu
165 170 175

Leu Met Thr Ala Pro Leu Ala Pro Glu Asp Thr Ile Ile Arg Val Lys
180 185 190

Gly Glu Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu Asn Leu Met
195 200 205

Lys Thr Phe Gly Val Glu Ile Ala Asn His His Tyr Gln Gln Phe Val
210 215 220

Val Lys Gly Gly Gln Gln Tyr His Ser Pro Gly Arg Tyr Leu Val Glu
225 230 235 240

Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Gly Gly Ile Lys
245 250 255

Gly Gly Thr Val Lys Val Thr Gly Ile Gly Gly Lys Ser Met Gln Gly
260 265 270

Asp Ile Arg Phe Ala Asp Val Leu His Lys Met Gly Ala Thr Ile Thr
275 280 285

Trp Gly Asp Asp Phe Ile Ala Cys Thr Arg Gly Glu Leu His Ala Ile
290 295 300

Asp Met Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr
305 310 315 320

Thr Ala Leu Phe Ala Lys Gly Thr Thr Thr Leu Arg Asn Ile Tyr Asn
325 330 335

Trp Arg Val Lys Glu Thr Asp Arg Leu Phe Ala Met Ala Thr Glu Leu
340 345 350

Arg Lys Val Gly Ala Glu Val Glu Glu Gly His Asp Tyr Ile Arg Ile
355 360 365

Thr Pro Pro Ala Lys Leu Gln His Ala Asn Ile Gly Thr Tyr Asn Asp
370 375 380

His Arg Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro
385 390 395 400

Val Thr Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr
405 410 415

Phe Glu Gln Leu Ala Arg Met Ser Thr Pro Ala
420 425

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 427 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Met Glu Ser Leu Thr Leu Gln Pro Ile Ala Arg Val Asp Gly Thr Val
1 5 10 15

Asn Leu Pro Gly Ser Lys Ser Val Ser Asn Arg Ala Leu Leu Leu Ala
20 25 30

Ala Leu Ala Arg Gly Thr Thr Val Leu Thr Asn Leu Leu Asp Ser Asp
35 40 45

Asp Val Arg His Met Leu Asn Ala Leu Ser Ala Leu Gly Val His Tyr
50 55 60

Val Leu Ser Ser Asp Arg Thr Arg Cys Glu Val Thr Gly Thr Gly Gly
65 70 75 80

Pro Leu Gln Ala Gly Ser Ala Leu Glu Leu Phe Leu Gly Asn Ala Gly
85 90 95

Thr Ala Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Ser Asn Asp
100 105 110

Ile Val Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro Ile Gly His
115 120 125



Leu Val Asp Ala Leu Arg Gln Gly Gly Ala Gln Ile Asp Tyr Leu Glu
130 135 140

Gln Glu Asn Tyr Pro Pro Leu Arg Leu Arg Gly Gly Phe Thr Gly Gly
145 150 155 160

Asp Val Glu Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu
165 170 175

Leu Met Ala Ser Pro Leu Ala Pro Gln Asp Thr Val Ile Ala Ile Lys
180 185 190

Gly Glu Leu Val Ser Arg Pro Tyr Ile Asp Ile Thr Leu His Leu Met
195 200 205

Lys Thr Phe Gly Val Glu Val Glu Asn Gln Ala Tyr Gln Arg Phe Ile
210 215 220

Val Arg Gly Asn Gln Gln Tyr Gln Ser Pro Gly Asp Tyr Leu Val Glu
225 230 235 240

Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Gly Ala Ile Lys
245 250 255

Gly Gly Thr Val Lys Val Thr Gly Ile Gly Arg Asn Ser Val Gln Gly
260 265 270

Asp Ile Arg Phe Ala Asp Val Leu Glu Lys Met Gly Ala Thr Val Thr
275 280 285

Trp Gly Glu Asp Tyr Ile Ala Cys Thr Arg Gly Glu Leu Asn Ala Ile
290 295 300

Asp Met Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr
305 310 315 320

Ala Ala Leu Phe Ala Arg Gly Thr Thr Thr Leu Arg Asn Ile Tyr Asn
325 330 335

Trp Arg Val Lys Glu Thr Asp Arg Leu Phe Ala Met Ala Thr Glu Leu
340 345 350

Arg Lys Val Gly Ala Glu Val Gl^ Glu Gly Glu Asp Tyr Ile Arg Ile



365

Gln Phe Ala Glu
Ile Gly
380
Thr
Tyr Asn Asp



His Arg Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro
385 390 395 400

Val Thr Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr
405 410 415

Phe Gly Gln Leu Ala Arg Ile Ser Thr Leu Ala
420 425

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 427 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Met Leu Glu Ser Leu Thr Leu His Pro Ile Ala Leu Ile Asn Gly Thr
1 5 10 15

Val Asn Leu Pro Gly Ser Lys Ser Val Ser Asn Arg Ala Leu Leu Leu
20 25 30

Ala Ala Leu Ala Glu Gly Thr Thr Gln Leu Asn Asn Leu Leu Asp Ser
35 40 45

Asp Asp Ile Arg His Met Leu Asn Ala Leu Gln Ala Leu Gly Val Lys
50 55 60

Tyr Arg Leu Ser Ala Asp Arg Thr Arg Cys Glu Val Asp Gly Leu Gly
65 70 75 80

Gly Lys Leu Val Ala Glu Gln Pro Leu Glu Leu Phe Leu Gly Asn Ala
85 90 95

Gly Thr Ala Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Lys Asn
100 105 110

Asp Ile Val Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro Ile Gly
115 120 125



His Leu Val Asp Ala Leu Arg Gln Gly Gly Ala Gln Ile Asp Tyr Leu
130 135 140

Glu Gln Glu Asn Tyr Arg Arg Cys Ile Ala Gly Gly Phe Arg Gly Gly
145 150 155 160

Lys Leu Thr Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu
165 170 175

Leu Met Thr Ala Pro Leu Ala Glu Gln Asp Thr Glu Ile Gln Ile Gln
180 185 190

Gly Glu Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu His Leu Met
195 200 205

Lys Ala Phe Gly Val Asp Val Val His Glu Asn Tyr Gln Ile Phe His
210 215 220

Ile Lys Gly Gly Gln Thr Tyr Arg Ser Pro Gly Ile Tyr Leu Val Glu
225 230 235 240

Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Ala Ala Ile Lys
245 250 255

Gly Gly Thr Val Arg Val Thr Gly Ile Gly Lys Gln Ser Val Gln Gly
260 265 270

Asp Thr Lys Phe Ala Asp Val Leu Glu Lys Met Gly Ala Lys Ile Ser
275 280 285

Trp Gly Asp Asp Tyr Ile Glu Cys Ser Arg Gly Glu Leu Gln Gly Ile
290 295 300

Asp Met Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr
305 310 315 320

Thr Ala Leu Phe Ala Asp Gly Pro Thr Val Ile Arg Asn Ile Tyr Asn
325 330 335

Trp Arg Val Lys Glu Thr Asp Arg Leu Ser Ala Met Ala Thr Glu Leu
340 345 350

Arg Lys Val Gly Ala Glu Val Glu Glu Gly Gln Asp Tyr Ile Arg Val
355 360 365

Val Pro Pro Ala Gln Leu Ile Ala Ala Glu Ile Gly Thr Tyr Asn Asp
370 375 380

His Arg Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro
385 390 395 400

Val Thr Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr
405 410 415

Phe Glu Gln Leu Ala Arg Leu Ser Gln Ile Ala
420 425

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 432 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Met Glu Lys Ile Thr Leu Ala Pro Ile Ser Ala Val Glu Gly Thr Ile
1 5 10 15

Asn Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ala Leu Leu Leu Ala
20 25 30

Ala Leu Ala Lys Gly Thr Thr Lys Val Thr Asn Leu Leu Asp Ser Asp
35 40 45

Asp Ile Arg His Met Leu Asn Ala Leu Lys Ala Leu Gly Val Arg Tyr
50 55 60

Gln Leu Ser Asp Asp Lys Thr Ile Cys Glu Ile Glu Gly Leu Gly Gly
65 70 75 80

Ala Phe Asn Ile Gln Asp Asn Leu Ser Leu Phe Leu Gly Asn Ala Gly
85 90 95

Thr Ala Met Arg Pro Leu Thr Ala Ala Leu Cys Leu Lys Gly Asn His
100 105 110

Glu Val Glu Ile Ile Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro
115 120 125



Ile Leu His Leu Val Asp Ala Leu Arg Gln Ala Gly Ala Asp Ile Arg
130 135 140

Tyr Leu Glu Asn Glu Gly Tyr Pro Pro Leu Ala Ile Arg Asn Lys Gly
145 150 155 160

Ile Lys Gly Gly Lys Val Lys Ile Asp Gly Ser Ile Ser Ser Gln Phe
165 170 175

Leu Thr Ala Leu Leu Met Ser Ala Pro Leu Ala Glu Asn Asp Thr Glu
180 185 190

Ile Glu Ile Ile Gly Glu Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr
195 200 205

Leu Ala Met Met Arg Asp Phe Gly Val Lys Val Glu Asn His His Tyr
210 215 220

Gln Lys Phe Gln Val Lys Gly Asn Gln Ser Tyr Ile Ser Pro Asn Lys
225 230 235 240

Tyr Leu Val Glu Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala
245 250 255

Gly Ala Ile Lys Gly Lys Val Lys Val Thr Gly Ile Gly Lys Asn Ser
260 265 270

Ile Gln Gly Asp Arg Leu Phe Ala Asp Val Leu Glu Lys Met Gly Ala
275 280 285

Lys Ile Thr Trp Gly Glu Asp Phe Ile Gln Ala Glu His Ala Glu Leu
290 295 300

Asn Gly Ile Asp Met Asp Met Asn His Ile Pro Asp Ala Ala Met Thr
305 310 315 320

Ile Ala Thr Thr Ala Leu Phe Ser Asn Gly Glu Thr Val Ile Arg Asn
325 330 335

Ile Tyr Asn Trp Arg Val Lys Glu Thr Asp Arg Leu Thr Ala Met Ala
340 345 350

Thr Glu Leu Arg Lys Val Gly Ala Glu Val Glu Glu Gly Glu Asp Phe
355 360 365

.e Gln Pro Leu Ala Leu Asn Gln Phe Lys His Ala Asn Ile
375 380

Glu Thr Tyr Asn Asp His Arg Met Ala Met Cys Phe Ser Leu Ile Ala
385 390 395 400






Leu Ser Asn Thr Pro Val Thr Ile Leu Asp Pro Lys Cys Thr Ala Lys
405 410 415

Thr Phe Pro Thr Phe Phe Asn Glu Phe Glu Lys Ile Cys Leu Lys Asn
420 425 430



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 



Val Ile Lys Asp Ala Thr Ala Ile Thr Leu Asn Pro Ile Ser Tyr Ile
1 5 10 15

Glu Gly Glu Val Arg Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ala
20 25 30

Leu Leu Leu Ser Ala Leu Ala Lys Gly Lys Thr Thr Leu Thr Asn Leu
35 40 45

Leu Asp Ser Asp Asp Val Arg His Met Leu Asn Ala Leu Lys Glu Leu
50 55 60

Gly Val Thr Tyr Gln Leu Ser Glu Asp Lys Ser Val Cys Glu Ile Glu
65 70 75 80

Gly Leu Gly Arg Ala Phe Glu Trp Gln Ser Gly Leu Ala Leu Phe Leu
85 90 95

Gly Asn Ala Gly Thr Ala Met Arg Pro Leu Thr Ala Ala Leu Cys Leu
100 105 110

Ser Thr Pro Asn Arg Glu Gly Lys Asn Glu Ile Val Leu Thr Gly Glu
115 120 125

Pro Arg Met Lys Glu Arg Pro Ile Gln His Leu Val Asp Ala Leu Cys
130 135 140

Gln Ala Gly Ala Glu Ile Gln Tyr Leu Glu Gln Glu Gly Tyr Pro Pro
145 150 155 160

Ile Ala Ile Arg Asn Thr Gly Leu Lys Gly Gly Arg Ile Gln Ile Asp
165 170 175

Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu Leu Met Ala Ala Pro
180 185 190

Met Ala Glu Ala Asp Thr Glu Ile Glu Ile Ile Gly Glu Leu Val Ser
195 200 205

Lys Pro Tyr Ile Asp Ile Thr Leu Lys Met Met Gln Thr Phe Gly Val
210 215 220

Glu Val Glu Asn Gln Ala Tyr Gln Arg Phe Leu Val Lys Gly His Gln
225 230 235 240

Gln Tyr Gln Ser Pro His Arg Phe Leu Val Glu Gly Asp Ala Ser Ser
245 250 255

Ala Ser Tyr Phe Leu Ala Ala Ala Ala Ile Lys Gly Lys Val Lys Val
260 265 270

Thr Gly Val Gly Lys Asn Ser Ile Gln Gly Asp Arg Leu Phe Ala Asp
275 280 285

Val Leu Glu Lys Met Gly Ala His Ile Thr Trp Gly Asp Asp Phe Ile
290 295 300

Gln Val Glu Lys Gly Asn Leu Lys Gly Ile Asp Met Asp Met Asn His
305 310 315 320

Ile Pro Asp Ala Ala Met Thr Ile Ala Thr Thr Ala Leu Phe Ala Glu
325 330 335

Gly Glu Thr Val Ile Arg Asn Ile Tyr Asn Trp Arg Val Lys Glu Thr
340 345 350

Asp Arg Leu Thr Ala Met Ala Thr Glu Leu Arg Lys Val Gly Ala Glu
355 360 365

Val Glu Glu Gly Glu Asp Phe Ile Arg Ile Gln Pro Leu Asn Leu Ala
370 375 380

Gln Phe Gln His Ala Glu Leu Asn Ile His Asp His Arg Met Ala Met






Cys Phe Ala Leu Ile Ala Leu Ser Lys Thr Ser Val Thr Ile Leu Asp
405 410 415

Pro Ser Cys Thr Ala Lys Thr Phe Pro Thr Phe Leu Ile Leu Phe Thr
420 425 430

Leu Asn Thr Arg Glu Val Ala Tyr Arg 
435 440 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Asn Ser Leu Arg Leu Glu Pro Ile Ser Arg Val Ala Gly Glu Val Asn
1 5 10 15

Leu Pro Gly Ser Lys Ser Val Ser Asn Arg Ala Leu Leu Leu Ala Ala
20 25 30

Leu Ala Arg Gly Thr Thr Arg Leu Thr Asn Leu Leu Asp Ser Asp Asp
35 40 45

Ile Arg His Met Leu Ala Ala Leu Thr Gln Leu Gly Val Lys Tyr Lys
50 55 60

Leu Ser Ala Asp Lys Thr Glu Cys Thr Val His Gly Leu Gly Arg Ser
65 70 75 80

Phe Ala Val Ser Ala Pro Val Asn Leu Phe Leu Gly Asn Ala Gly Thr
85 90 95

Ala Met Arg Pro Leu Cys Ala Ala Leu Cys Leu Gly Ser Gly Glu Tyr
100 105 110

Met Leu Gly Gly Glu Pro Arg Met Glu Glu Arg Pro Ile Gly His Leu
115 120 125

Val Asp Cys Leu Ala Leu Lys Gly Ala His Ile Gln Tyr Leu Lys Lys
130 135 140

Asp Gly Tyr Pro Pro Leu Val Val Asp Ala Lys Gly Leu Trp Gly Gly
145 150 155 160

Asp Val His Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Phe
165 170 175

Leu Met Ala Ala Pro Ala Met Ala Pro Val Ile Pro Arg Ile His Ile
180 185 190

Lys Gly Glu Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu His Ile
195 200 205

Met Asn Ser Ser Gly Val Val Ile Glu His Asp Asn Tyr Lys Leu Phe
210 215 220

Tyr Ile Lys Gly Asn Gln Ser Ile Val Ser Pro Gly Asp Phe Leu Val
225 230 235 240

Glu Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Gly Ala Ile
245 250 255

Lys Gly Lys Val Arg Val Thr Gly Ile Gly Lys His Ser Ile Gly Asp
260 265 270

Ile His Phe Ala Asp Val Leu Glu Arg Met Gly Ala Arg Ile Thr Trp
275 280 285

Gly Asp Asp Phe Ile Glu Ala Glu Gln Gly Pro Leu His Gly Val Asp
290 295 300

Met Asp Met Asn His Ile Pro Asp Val Gly His Asp His Ser Gly Gln
305 310 315 320

Ser His Cys Leu Pro Arg Val Pro Pro His Ser Gln His Leu Gln Leu
325 330 335

Ala Val Arg Asp Asp Arg Cys Thr Pro Cys Thr His Gly His Arg Arg
340 345 350

Ala Gln Ala Gly Val Ser Glu Glu Gly Thr Thr Phe Ile Thr Arg Asp
355 360



Ala Ala A:
Pro
Ala Gln
Arg Arg
sp Arg His Leu Gln Arg Ser
380
Arg Ile Ala Met Cys Phe
385 390
Ser Leu Val
.a Leu
395



Ser Asp Ile Ala Val
400

Thr Ile Asn Asp Pro Gly Cys Thr Ser Lys Thr Phe Pro Asp Tyr Phe
405 410 415

Asp Lys Leu Ala Ser Val Ser Gln Ala Val
420 425 

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 442 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:



20 25 30 

Leu Leu Ala Ala Leu Ala Glu Gly Ser Thr Glu Ile Thr Gly Leu Leu

Val Ser Val Gly Glu Val Ala Asp Gly Cys Val Thr Ile Glu Gly Val
65 70 75 80



Leu Val Asp Ala Leu Arg Gln Phe Gly Ala Gly Ile Glu Tyr Leu Gly 
130 135 140 

Gln Ala Gly Tyr Pro Pro Leu Arg Ile Gly Gly Gly Ser Ile Arg Val 
145 150 155 160 

Asp Gly Pro Val Arg Val Glu Gly Ser Val Ser Ser Gln Phe Leu Thr 
165 170 175 

Ala Leu Leu Met Ala Ala Pro Val Leu Ala Arg Arg Ser Gly Gln Asp 
180 185 190 

Ile Thr Ile Glu Val Val Gly Glu Leu Ile Ser Lys Pro Tyr Ile Glu 
195 200 205 

Ile Thr Leu Asn Leu Met Ala Arg Phe Gly Val Ser Val Arg Arg Asp 
210 215 220 

Gly Trp Arg Ala Phe Thr Ile Ala Arg Asp Ala Val Tyr Arg Gly Pro 
225 230 235 240 

Gly Arg Met Ala Ile Glu Gly Asp Ala Ser Thr Ala Ser Tyr Phe Leu 
245 250 255 

Ala Leu Gly Ala Ile Gly Gly Gly Pro Val Arg Val Thr Gly Val Gly 
260 265 270 

Glu Asp Ser Ile Gln Gly Asp Val Ala Phe Ala Ala Thr Leu Ala Ala 
275 280 285 

Met Gly Ala Asp Val Arg Tyr Gly Pro Gly Trp Ile Glu Thr Arg Gly 
290 295 300 

Val Arg Val Ala Glu Gly Gly Arg Leu Lys Ala Phe Asp Ala Asp Phe 

Asn Leu Ile Pro Asp Ala Ala Met Thr Ala Ala Thr Leu Ala Leu Tyr 
325 330 335 

Ala Asp Gly Pro Cys Arg Leu Arg Asn Ile Gly Ser Trp Arg Val Lys 
340 345 350 

Glu Thr Asp Arg Ile His Ala Met His Thr Glu Leu Glu Lys Leu Gly 



350 



365 



Val Gln Ser Gly Ala Asp Trp Leu Glu Val Ala Pro Pro Glu 



Thr Trp Asp Asp His Arg 
Val Arg 

« y , hrPheP „ Aspivr ;; 6 5 Asp 

430 

^ % ^ Al. «. Arg Asp 

440 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: 

r Giu ser l - r Leu Gin p - ^ ™ ASP GIy A , a Ile 

Leu Pro Gly Ser Lys Ser Val Ser Asn Arg Ala Leu Leu ^ ^ 
^ 5 in 



Ala Cys Gly Lys Thr Val Leu Thr Asn Leu 



Ala Leu Ala _ _ _ _ 

Leu Leu Asp Ser Asp 
ASP V.l ,r 5 Hl . Met Leu Af A1 . s „ A1 . Lau cly Ile ^ r 



65 ~ V ^ Arg CyS As ? t le Ihr G1 ^ Asn GIV Gly 

5 80 



Leu Ser Ala Asp Arg Thr Arg Cys 
70 

Pro Leu Arg Ala Ser Gly Thr Leu Glu Leu Phe Leu Gly Asn Ala Gly 
85 90 95 

Thr Ala Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Gln Asn Glu 

Leu Val Asp Ser Leu Arg Gln Gly Gly Ala Asn Ile Asp Tyr Leu Glu 
130 135 140 

Gln Glu Asn Tyr Pro Pro Leu Arg Leu Arg Gly Gly Phe Ile Gly Gly 
145 150 155 160 

Asp Ile Glu Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu 
165 170 175 

Leu Met Thr Ala Pro Leu Ala Pro Glu Asp Thr Ile Ile Arg Val Lys 
180 185 190 

Gly Glu Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu Asn Leu Met 
195 200 205 

Lys Thr Phe Gly Val Glu Ile Ala Asn His His Tyr Gln Gln Phe Val 
210 215 220 

Val Lys Gly Gly Gln Gln Tyr His Ser Pro Gly Arg Tyr Leu Val Glu 
225 230 235 240 

Gly Asp Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Gly Gly Ile Lys 
245 250 255 

Gly Gly Thr Val Lys Val Thr Gly Ile Gly Gly Lys Ser Met Gln Gly 
260 265 270 

Asp Ile Arg Phe Ala Asp Val Leu His Lys Met Gly Ala Thr Ile Thr 
275 280 285 

Trp Gly Asp Asp Phe Ile Ala Cys Thr Arg Gly Glu Leu His Ala Ile 
290 295 300 

Asp Met Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr 
305 310 315 320 

Thr Ala Leu Phe Ala Lys Gly Thr Thr Thr Leu Arg Asn Ile Tyr Asn 
325 330 335 

Trp Arg Val Lys Glu Thr Asp Arg Leu Phe Ala Met Ala Thr Glu Leu 
340 345 350 

Arg Lys Val Gly Ala Glu Val Glu Glu Gly His Asp Tyr Ile Arg Ile 

Thr Pro Pro Ala Lys Leu Gln His Ala Asp Ile Gly Thr Tyr Asn Asp 
370 375 380 

His Arg Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro 
385 390 395 400 

Val Thr Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr 

Phe Glu Gln Leu Ala Arg Met Ser Thr Pro Ala 
420 425 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1894 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 275..1618 
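
As a quick consistency check on the CDS feature above, the span 275..1618 can be converted to a residue count and compared with the 447 amino acids reported for SEQ ID NO: 67. The following is a minimal Python sketch only. The function name `cds_residue_count` is hypothetical, and the sketch assumes that the CDS span includes one terminal stop codon; neither is stated by the listing itself.

```python
def cds_residue_count(start: int, end: int, has_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS spanning start..end.

    Positions are 1-based and inclusive, as in the LOCATION field.
    Assumption: when has_stop is True, the last codon of the span is a stop
    codon and is not counted as a residue.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    codons = length // 3
    return codons - 1 if has_stop else codons


# CDS location taken from the FEATURE block of SEQ ID NO: 66.
print(cds_residue_count(275, 1618))  # 447
```

Under these assumptions the span covers 1344 bases, that is 448 codons, which gives 447 residues plus a stop codon.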



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

ACGGGCTGTA ACGGTAGTAG GGGTCCCGAG CACAAAAGCG GTGCCGGCAA GCAGAACTAA 

TTTCCATGGG GAATAATGGT ATTTCATTGG TTTGGCCTCT GGTCTGGCAA TGGTTGCTAG 

GCGATCGCCT GTTGAAATTA ACAAACTGTC GCCCTTCCAC TGACCATGGT AACGATGTTT 180 

TTTACTTCCT TGACTAACCG AGGAAAATTT GGCGGGGGGC AGAAATGCCA ATACAATTTA 240 

GCTTGGTCTT CCCTGCCCCT AATTTGTCCC CTCC ATG GCC TTG CTT TCC CTC 292 
Met Ala Leu Leu Ser Leu 

AAC AAT CAT CAA TCC CAT CAA CGC TTA ACT GTT AAT CCC CCT GCC CAA 
Asn Asn His Gln Ser His Gln Arg Leu Thr Val Asn Pro Pro Ala Gln 

TCC CAT CGG GCC TTG ATG TTG GGG GCG ATC GCC ACC GGG GAA ACC ATT 
Ser His Arg Ala Leu Met Leu Gly Ala Ile Ala Thr Gly Glu Thr Ile 
40 45 50 

ATC GAA GGG CTA CTG TTG GGG GAA GAT CCC CGT AGT ACG GCC CAT TGC 
Ile Glu Gly Leu Leu Leu Gly Glu Asp Pro Arg Ser Thr Ala His Cys 
55 60 65 70 

TTT CGG GCC ATG GGA GCA GAA ATC AGC GAA CTA AAT TCA GAA AAA ATC 
Phe Arg Ala Met Gly Ala Glu Ile Ser Glu Leu Asn Ser Glu Lys Ile 
75 80 85 

ATC GTT CAG GGT CGG GGT CTG GGA CAG TTG CAG GAA CCC AGT ACC GTT 
Ile Val Gln Gly Arg Gly Leu Gly Gln Leu Gln Glu Pro Ser Thr Val 
90 95 100 

TTG GAT GCG GGG AAC TCT GGC ACC ACC ATG CGC TTA ATG TTG GGC TTG 
Leu Asp Ala Gly Asn Ser Gly Thr Thr Met Arg Leu Met Leu Gly Leu 
105 110 115 

CTA GCC GGG CAA AAA GAT TGT TTA TTC ACC GTC ACC GGC GAT GAT TCC 
Leu Ala Gly Gln Lys Asp Cys Leu Phe Thr Val Thr Gly Asp Asp Ser 
120 125 130 

CTC CGT CAC CGC CCC ATG TCC CGG GTA ATT CAA CCC TTG CAA CAA ATG 
Leu Arg His Arg Pro Met Ser Arg Val Ile Gln Pro Leu Gln Gln Met 
135 140 145 150 

GGG GCA AAA ATT TGG GCC CGG AGT AAC GGC AAG TTT GCG CCG CTG GCA 
Gly Ala Lys Ile Trp Ala Arg Ser Asn Gly Lys Phe Ala Pro Leu Ala 
155 160 165 

GTC CAG GGT AGC CAA TTA AAA CCG ATC CAT TAC CAT TCC CCC ATT GCT 
Val Gln Gly Ser Gln Leu Lys Pro Ile His Tyr His Ser Pro Ile Ala 
170 175 180 

TCA GCC CAG GTA AAG TCC TGC CTG TTG CTA GCG GGG TTA ACC ACC GAG 
Ser Ala Gln Val Lys Ser Cys Leu Leu Leu Ala Gly Leu Thr Thr Glu 
185 190 195 

GGG GAC ACC ACG GTT ACA GAA CCA GCT CTA TCC CGG GAT CAT AGC GAA 
Gly Asp Thr Thr Val Thr Glu Pro Ala Leu Ser Arg Asp His Ser Glu 
200 205 210 

CGC ATG TTG CAG GCC TTT GGA GCC AAA TTA ACC ATT GAT CCA GTA ACC 
Arg Met Leu Gln Ala Phe Gly Ala Lys Leu Thr Ile Asp Pro Val Thr 
215 220 225 230 

CAT AGC GTC ACT GTC CAT GGC CCG GCC CAT TTA ACG GGG CAA CGG GTG 
His Ser Val Thr Val His Gly Pro Ala His Leu Thr Gly Gln Arg Val 
235 240 245 

GTG GTG CCA GGG GAC ATC AGC TCG GCG GCC TTT TGG TTA GTG GCG GCA 
Val Val Pro Gly Asp Ile Ser Ser Ala Ala Phe Trp Leu Val Ala Ala 
250 255 260 

TCC ATT TTG CCT GGA TCA GAA TTG TTG GTG GAA AAT GTA GGC ATT AAC 
Ser Ile Leu Pro Gly Ser Glu Leu Leu Val Glu Asn Val Gly Ile Asn 
265 270 275 

CCC ACC AGG ACA GGG GTG TTG GAA GTG TTG GCC CAG ATG GGG GCG GAC 
Pro Thr Arg Thr Gly Val Leu Glu Val Leu Ala Gln Met Gly Ala Asp 
280 285 290 

ATT ACC CCG GAG AAT GAA CGA TTG GTA ACG GGG GAA CCG GTA GCA GAT 
Ile Thr Pro Glu Asn Glu Arg Leu Val Thr Gly Glu Pro Val Ala Asp 
295 300 305 310 

CTG CGG GTT AGG GCA AGC CAT CTC CAG GGT TGC ACC TTC GGC GGC GAA 
Leu Arg Val Arg Ala Ser His Leu Gln Gly Cys Thr Phe Gly Gly Glu 
315 320 325 

ATT ATT CCC CGA CTG ATT GAT GAA ATT CCC ATT TTG GCA GTG GCG GCG 
Ile Ile Pro Arg Leu Ile Asp Glu Ile Pro Ile Leu Ala Val Ala Ala 
330 335 340 

GCC TTT GCA GAG GGC ACT ACC CGC ATT GAA GAT GCC GCA GAA CTG AGG 
Ala Phe Ala Glu Gly Thr Thr Arg Ile Glu Asp Ala Ala Glu Leu Arg 
345 350 355 

GTT AAA GAA AGC GAT CGC CTG GCG GCC ATT GCT TCG GAG TTG GGC AAA 
Val Lys Glu Ser Asp Arg Leu Ala Ala Ile Ala Ser Glu Leu Gly Lys 
360 365 370 

ATG GGG GCC AAA GTC ACC GAA TTT GAT GAT GGC CTG GAA ATT CAA GGG 
Met Gly Ala Lys Val Thr Glu Phe Asp Asp Gly Leu Glu Ile Gln Gly 
375 380 385 390 

GGA AGC CCG TTA CAA GGG GCC GAG GTG GAT AGC TTG ACG GAT CAT CGC 
Gly Ser Pro Leu Gln Gly Ala Glu Val Asp Ser Leu Thr Asp His Arg 
395 400 405 

ATT rrr ATG GCG TTG GCG ATC GCC GCT TTA GGT AGT GGG GGG CAA ACA 
Ile Ala Met Ala Leu Ala Ile Ala Ala Leu Gly Ser Gly Gly Gln Thr 
410 415 420 

ATT ATT AAC CGG GCG GAA GCG GCC GCC ATT TCC TAT CCA GAA TTT TTT 1588 
Ile Ile Asn Arg Ala Glu Ala Ala Ala Ile Ser Tyr Pro Glu Phe Phe 
425 430 435 

GGC ACG CTA GGG CAA GTT GCC CAA GGA TAAAGTTAGA AAAACTCCTG 1635 
Gly Thr Leu Gly Gln Val Ala Gln Gly 
440 445 

GGCGGTTTGT AAATGTTTTA CCAAGGTAGT TTGGGGTAAA GGCCCCAGCA AGTGCTGCCA 1695 

GGGTAATTTA TCCGCAATTG ACCAATCGGC ATGGACCGTA TCGTTCAAAC TGGGTAATTC 1755 

TCCCTTTAAT TCCTTAAAAG CTCGCTTAAA ACTGCCCAAC GTATCTCCGT AATGGCGAGT 1815 

GAGTAGAAGT AATGGGGCCA AACGGCGATC GCCACGGGAA ATTAAAGCCT GCATCACTGA 1875 

CCACTTATAA CTTTCGGGA 1894 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: 

Met Ala Leu Leu Ser Leu Asn Asn His Gln Ser His Gln Arg Leu Thr 
1 5 10 15 

Val Asn Pro Pro Ala Gln Gly Val Ala Leu Thr Gly Arg Leu Arg Val 
20 25 30 

Pro Gly Asp Lys Ser Ile Ser His Arg Ala Leu Met Leu Gly Ala Ile 
35 40 45 

Ala Thr Gly Glu Thr Ile Ile Glu Gly Leu Leu Leu Gly Glu Asp Pro 
50 55 60 

Arg Ser Thr Ala His Cys Phe Arg Ala Met Gly Ala Glu Ile Ser Glu 



Leu Asn Ser Glu Lys Ile Ile Val Gln Gly Arg Gly Leu Gly Gln Leu 
85 90 95 

Gln Glu Pro Ser Thr Val Leu Asp Ala Gly Asn Ser Gly Thr Thr Met 
100 105 110 

Arg Leu Met Leu Gly Leu Leu Ala Gly Gln Lys Asp Cys Leu Phe Thr 
115 120 125 

Val Thr Gly Asp Asp Ser Leu Arg His Arg Pro Met Ser Arg Val Ile 
130 135 140 

Gln Pro Leu Gln Gln Met Gly Ala Lys Ile Trp Ala Arg Ser Asn Gly 
145 150 155 160 

Lys Phe Ala Pro Leu Ala Val Gln Gly Ser Gln Leu Lys Pro Ile His 
165 170 175 

Tyr His Ser Pro Ile Ala Ser Ala Gln Val Lys Ser Cys Leu Leu Leu 
180 185 190 

Ala Gly Leu Thr Thr Glu Gly Asp Thr Thr Val Thr Glu Pro Ala Leu 
195 200 205 

Ser Arg Asp His Ser Glu Arg Met Leu Gln Ala Phe Gly Ala Lys Leu 
210 215 220 

Thr Ile Asp Pro Val Thr His Ser Val Thr Val His Gly Pro Ala His 
225 230 235 240 

Leu Thr Gly Gln Arg Val Val Val Pro Gly Asp Ile Ser Ser Ala Ala 
245 250 255 

Phe Trp Leu Val Ala Ala Ser Ile Leu Pro Gly Ser Glu Leu Leu Val 
260 265 270 

Glu Asn Val Gly Ile Asn Pro Thr Arg Thr Gly Val Leu Glu Val Leu 
275 280 285 

Ala Gln Met Gly Ala Asp Ile Thr Pro Glu Asn Glu Arg Leu Val Thr 
290 295 300 

Gly Glu Pro Val Ala Asp Leu Arg Val Arg Ala Ser His Leu Gln Gly 
305 310 315 320 

Cys Thr Phe Gly Gly Glu Ile Ile Pro Arg Leu Ile Asp Glu Ile Pro 
325 330 335 

Ile Leu Ala Val Ala Ala Ala Phe Ala Glu Gly Thr Thr Arg Ile Glu 
340 345 350 

Asp Ala Ala Glu Leu Arg Val Lys Glu Ser Asp Arg Leu Ala Ala Ile 
355 360 365 

Ala Ser Glu Leu Gly Lys Met Gly Ala Lys Val Thr Glu Phe Asp Asp 
370 375 

Gly Leu Glu Ile Gln Gly Gly Ser Pro Leu Gln Gly Ala Glu Val Asp 
385 390 395 400 

Ser Leu Thr Asp His Arg Ile Ala Met Ala Leu Ala Ile Ala Ala Leu 
405 410 

Gly Ser Gly Gly Gln Thr Ile Ile Asn Arg Ala Glu Ala Ala Ala Ile 
420 425 430 

Ser Tyr Pro Glu Phe Phe Gly Thr Leu Gly Gln Val Ala Gln Gly 
435 440 445 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1479 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 107..1438 
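
The amino acid rows printed beneath the codon rows of a CDS can be spot-checked against the standard genetic code. The following is a minimal Python sketch that builds the standard codon table and translates the codons ATG ATG ACG that open the coding region of this sequence. The helper name `translate` is hypothetical, and the use of the standard genetic code is an assumption made for illustration; the listing does not name a translation table.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G with the third base
# varying fastest; '*' marks a stop codon. Assumed here, not stated in the listing.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, matching the style used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(codons: str) -> list:
    """Translate whitespace-separated DNA codons into three-letter codes."""
    return [THREE_LETTER[CODON_TABLE[c.upper()]] for c in codons.split()]


print(" ".join(translate("ATG ATG ACG")))  # Met Met Thr
```

A codon that contains an unreadable base raises a `KeyError` in this sketch, which makes such positions easy to flag for manual review.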

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

TTTAAAAACA ATGAGTTAAA AAATTATTTT TCTGGCACAC GCGCTTTTTT TGCATTTTTT 

CCATTTT ~ CCGG CACAA TAACGTTGGT TTTATAAAAG GAAATG ATG ATG ACG 
Met Met Thr 



AAT ATA TGG CAC ACC GCG CCC GTC TCT GCG CTT TCC GGC GAA ATA ACG 
Asn Ile Trp His Thr Ala Pro Val Ser Ala Leu Ser Gly Glu Ile Thr 
15 

ATA TGC GGC GAT AAA TCA ATG TCG CAT CGC GCC TTA TTA TTA GCA GCG 
Ile Cys Gly Asp Lys Ser Met Ser His Arg Ala Leu Leu Leu Ala Ala 
20 25 30 35 

TTA GCA GAA GGA CAA ACG GAA ATC CGC GGC TTT TTA GCG TGC GCG GAT 
Leu Ala Glu Gly Gln Thr Glu Ile Arg Gly Phe Leu Ala Cys Ala Asp 
40 45 50 

TGT TTG GCG ACG CGG CAA GCA TTG CGC GCA TTA GGC GTT GAT ATT CAA 
Cys Leu Ala Thr Arg Gln Ala Leu Arg Ala Leu Gly Val Asp Ile Gln 
55 60 65 

AGA GAA AAA GAA ATA GTG ACG ATT CGC GGT GTG GGA TTT CTG GGT TTG 
Arg Glu Lys Glu Ile Val Thr Ile Arg Gly Val Gly Phe Leu Gly Leu 
70 75 80 

CAG CCG CCG AAA GCA CCG TTA AAT ATG CAA AAC AGT GGC ACT AGC ATG 
Gln Pro Pro Lys Ala Pro Leu Asn Met Gln Asn Ser Gly Thr Ser Met 
85 90 95 

CGT TTA TTG GCA GGA ATT TTG GCA GCG CAG CGC TTT GAG AGC GTG TTA 
Arg Leu Leu Ala Gly Ile Leu Ala Ala Gln Arg Phe Glu Ser Val Leu 
100 105 110 115 

TGC GGC GAT GAA TCA TTA GAA AAA CGT CCG ATG CAG CGC ATT ATT ACG 
Cys Gly Asp Glu Ser Leu Glu Lys Arg Pro Met Gln Arg Ile Ile Thr 
120 125 130 

CCG CTT GTG CAA ATG GGG GCA AAA ATT GTC AGT CAC AGC AAT TTT ACG 
Pro Leu Val Gln Met Gly Ala Lys Ile Val Ser His Ser Asn Phe Thr 

GCG CCG TTA CAT ATT TCA GGA CGC CCG CTG ACC GGC ATT GAT TAC GCG 
Ala Pro Leu His Ile Ser Gly Arg Pro Leu Thr Gly Ile Asp Tyr Ala 
150 155 

TTA CCG CTT CCC AGC GCG CAA TTA AAA AGT TGC CTT ATT TTG GCA GGA 
Leu Pro Leu Pro Ser Ala Gln Leu Lys Ser Cys Leu Ile Leu Ala Gly 
165 170 175 

~TA TTG GCT GAC GGT ACC ACG CGG CTG CAT ACT TGC GGC ATC AGT CGC 
Leu Leu Ala Asp Gly Thr Thr Arg Leu His Thr Cys Gly Ile Ser Arg 
180 185 190 195 



GAC CAC ACG GAA CGC ATG TTG CCG 
GGC GCA CTT GAG ATC 
Asp His Thr Glu Arg Met Leu Pro Leu Phe Gly Gly Ala Leu Glu Ile 
205 210 

AAG AAA GAG CAA ATA ATC GTC ACC GGT GGA CAA AAA TTG CAC GGT TGC 
Lys Lys Glu Gln Ile Ile Val Thr Gly Gly Gln Lys Leu His Gly Cys 
215 220 225 

GTG CTT GAT ATT GTC GGC GAT TTG TCG GCG GCG GCG TTT TTT ATG GTT 
Val Leu Asp Ile Val Gly Asp Leu Ser Ala Ala Ala Phe Phe Met Val 
230 235 240 

GCG GCT TTG ATT GCG CCG CGC GCG GAA GTC GTT ATT CGT AAT GTC GGC 
Ala Ala Leu Ile Ala Pro Arg Ala Glu Val Val Ile Arg Asn Val Gly 
245 250 255 

ATT AAT CCG ACG CGG GCG GCA ATC ATT ACT TTG TTG CAA AAA ATG GGC 
Ile Asn Pro Thr Arg Ala Ala Ile Ile Thr Leu Leu Gln Lys Met Gly 
260 265 270 275 

GGA CGG ATT GAA TTG CAT CAT CAG CGC TTT TGG GGC GCC GAA CCG GTG 
Gly Arg Ile Glu Leu His His Gln Arg Phe Trp Gly Ala Glu Pro Val 
280 285 290 

GCA GAT ATT GTT GTT TAT CAT TCA AAA TTG CGC GGC ATT ACG GTG GCG 
Ala Asp Ile Val Val Tyr His Ser Lys Leu Arg Gly Ile Thr Val Ala 
295 300 305 

CCG GAA TGG ATT GCC AAC GCG ATT GAT GAA TTG CCG ATT TTT TTT ATT 
Pro Glu Trp Ile Ala Asn Ala Ile Asp Glu Leu Pro Ile Phe Phe Ile 
310 315 320 

GCG GCA GCT TGC GCG GAA GGG ACG ACT TTT GTG GGC AAT TTG TCA GAA 
Ala Ala Ala Cys Ala Glu Gly Thr Thr Phe Val Gly Asn Leu Ser Glu 
325 330 335 

TTG CGT GTG AAA GAA TCG GAT CGT TTA GCG GCG ATG GCG CAA AAT TTA 
Leu Arg Val Lys Glu Ser Asp Arg Leu Ala Ala Met Ala Gln Asn Leu 
340 345 350 355 

CAA ACT TTG GGC GTG GCG TGC GAC GTT GGC GCC GAT TTT ATT CAT ATA 
Gln Thr Leu Gly Val Ala Cys Asp Val Gly Ala Asp Phe Ile His Ile 
365 370 



TAT GGA AGA AGC GAT CGG CAA TTT TTA CCG GCG CGG GTG AAC AGT TTT 
Tyr Gly Arg Ser Asp Arg Gln Phe Leu Pro Ala Arg Val Asn Ser Phe 
375 380 385 

GGC GAT CAT CGG ATT CCG ATG AGT TTG GCG GTG GCA GGT GTG CGC GCG 
Gly Asp His Arg Ile Ala Met Ser Leu Ala Val Ala Gly Val Arg Ala 



GCA GGT GAA TTA TTG ATT GAT GAC GGC GCG GTG GCG GCG GTT TCT ATG 1363 
Ala Gly Glu Leu Leu Ile Asp Asp Gly Ala Val Ala Ala Val Ser Met 
405 410 415 

CCG CAA TTT CGC GAT TTT GCC GCC GCA ATT GGT ATG AAT GTA GGA GAA 1411 
Pro Gln Phe Arg Asp Phe Ala Ala Ala Ile Gly Met Asn Val Gly Glu 
420 425 430 435 

AAA GAT GCG AAA AAT TGT CAC GAT TGATGGTCCT AGCGGTGTTG GAAAAGGCAC 1465 
Lys Asp Ala Lys Asn Cys His Asp 
440 

GGTGGCGCAA GCTT 1479 



(2) INFORMATION FOR SEQ ID NO: 69: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 443 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: 

Met Met Thr Asn Ile Trp His Thr Ala Pro Val Ser Ala Leu Ser Gly 
1 5 10 15 

Glu Ile Thr Ile Cys Gly Asp Lys Ser Met Ser His Arg Ala Leu Leu 

Leu Ala Ala Leu Ala Glu Gly Gln Thr Glu Ile Arg Gly Phe Leu Ala 
35 40 45 

Cys Ala Asp Cys Leu Ala Thr Arg Gln Ala Leu Arg Ala Leu Gly Val 
50 55 60 

Asp Ile Gln Arg Glu Lys Glu Ile Val Thr Ile Arg Gly Val Gly Phe 
65 70 75 80 

Leu Gly Leu Gln Pro Pro Lys Ala Pro Leu Asn Met Gln Asn Ser Gly 
85 90 95 

Thr Ser Met Arg Leu Leu Ala Gly Ile Leu Ala Ala Gln Arg Phe Glu 
100 105 110 

Ser Val Leu Cys Gly Asp Glu Ser Leu Glu Lys Arg Pro Met Gln Arg 
115 120 125 

Ile Ile Thr Pro Leu Val Gln Met Gly Ala Lys Ile Val Ser His Ser 
130 135 140 

Asn Phe Thr Ala Pro Leu His Ile Ser Gly Arg Pro Leu Thr Gly Ile 
145 150 155 160 

Asp Tyr Ala Leu Pro Leu Pro Ser Ala Gln Leu Lys Ser Cys Leu Ile 
165 170 175 

Leu Ala Gly Leu Leu Ala Asp Gly Thr Thr Arg Leu His Thr Cys Gly 
180 185 190 

Ile Ser Arg Asp His Thr Glu Arg Met Leu Pro Leu Phe Gly Gly Ala 
195 200 205 

Leu Glu Ile Lys Lys Glu Gln Ile Ile Val Thr Gly Gly Gln Lys Leu 
210 215 220 

His Gly Cys Val Leu Asp Ile Val Gly Asp Leu Ser Ala Ala Ala Phe 
225 230 235 240 

Phe Met Val Ala Ala Leu Ile Ala Pro Arg Ala Glu Val Val Ile Arg 
245 250 255 

Asn Val Gly Ile Asn Pro Thr Arg Ala Ala Ile Ile Thr Leu Leu Gln 
260 265 270 

Lys Met Gly Gly Arg Ile Glu Leu His His Gln Arg Phe Trp Gly Ala 
275 280 285 

Glu Pro Val Ala Asp Ile Val Val Tyr His Ser Lys Leu Arg Gly Ile 
290 295 300 

Thr Val Ala Pro Glu Trp Ile Ala Asn Ala Ile Asp Glu Leu Pro Ile 
305 310 315 320 

Phe Phe Ile Ala Ala Ala Cys Ala Glu Gly Thr Thr Phe Val Gly Asn 
325 330 335 

Leu Ser Glu Leu Arg Val Lys Glu Ser Asp Arg Leu Ala Ala Met Ala 
340 345 350 



Gln Asn Leu Gln Thr Leu Gly Val Ala Cys Asp Val Gly Ala Asp Phe 
355 360 365 

Ile His Ile Tyr Gly Arg Ser Asp Arg Gln Phe Leu Pro Ala Arg Val 
370 375 380 

Asn Ser Phe Gly Asp His Arg Ile Ala Met Ser Leu Ala Val Ala Gly 
385 390 395 400 

Val Arg Ala Ala Gly Glu Leu Leu Ile Asp Asp Gly Ala Val Ala Ala 
405 410 415 

Val Ser Met Pro Gln Phe Arg Asp Phe Ala Ala Ala Ile Gly Met Asn 
420 425 430 

Val Gly Glu Lys Asp Ala Lys Asn Cys His Asp 
435 440 



